Not simple, but as simple as possible


Physics should be made as simple as possible, 
but not any simpler. 


—A. Einstein 


Einstein gravity should be made as simple as possible, but not any simpler. 

My goal is to make Einstein gravity* as simple as possible. I believe that Einstein’s 
theory should be readily accessible to those who have mastered Newtonian mechanics and 
a modest amount of classical mathematics. To underline my point, I start with a review of 
F=ma. 

Seriously, what do you need to know to read this book? Only some knowledge of 
classical mechanics and electromagnetism! So I fondly imagine, perhaps unrealistically. 
More importantly, you need to be possessed of what we theoretical physicists call sense— 
physical, mathematical, and also common. 

I wrote this book in the same spirit as my Quantum Field Theory in a Nutshell.¹ In his
Physics Today review of that book, Zvi Bern wrote this lovely sentence aptly capturing my 
pedagogical philosophy: “The purpose of Zee’s book is not to turn students into experts— 
it is to make them fall in love with the subject.” I might extend that to “fall in love with the 
subject so that they might desire to become experts.” Here I am echoing William Butler 
Yeats, who said, “Education is not the filling of a pail, but the lighting of a fire.” 


* Also known as general relativity. 
A portion of this book can be used for an undergraduate course. I have done it, and I
provide a detailed course outline later in this preface. 

Accessible is not to be equated with dumbed-down or watered-down. Also, accessible is 
not necessarily the same as elementary: in the last parts of the book, I include some topics 
far beyond the usual introductory treatment. 

My strategy to make Einstein gravity as simple as possible has two prongs. The first is the 
emphasis on symmetry. As some readers may know, I have written an entire book² on the
role of symmetry in physics, and I absolutely love how symmetry guides us in constructing 
physical theories, a notion that started with Einstein gravity, in fact. The second is the 
extensive use of the action principle. The action is invariably simpler than the equations of 
motion and manifests the inherent symmetry much more forcefully. I can hardly believe 
that some well-known textbooks on Einstein’s theory barely mention the Einstein-Hilbert 
action. Symmetry and the action principle constitute the two great themes of theoretical 
physics. 

To get a flavor of what the book is about, you might want to glance at the recaps first; 
there is one at the end of each of the ten parts of the book. 


How difficult is Einstein gravity? 


Any intelligent student can grasp it without too much trouble. 
—A. Einstein, referring to his theory of gravity 


When Arthur Eddington returned from the famous 1919 solar eclipse expedition that 
observed light from a distant star bending in agreement with Einstein gravity, somebody 
asked him if it were true that only three people understood Einstein’s theory. Eddington 
replied, “Who is the third?” The story, apocryphal³ or not, is one of many⁴ that gives
Einstein’s theory its undeserved reputation of being incomprehensible. 

I believe that in some cases, people like to persist in believing that Einstein’s theory is 
beyond them. A renowned philosopher who is clearly well above average in intelligence 
(and who understands things that I find impossible to understand) once told me that he 
was tired of popular accounts of general relativity and that he would like to finally learn 
the subject for real. He also emphasized to me that he had taken advanced calculus⁵ in
college, as if to say that he could handle the math. I replied that, for a small fee, my 
impecunious graduate student could readily teach him the essence of general relativity 
in six easy lessons. I never heard from the renowned philosopher again. I was happy and 
he was happy: he could go on enunciating philosophical profundities about relative truths⁶
and physical reality. 

The point of the story is that it is not that difficult. 
For whom is this book intended


Experience with my field theory textbook suggests that readers of this book will include the 
following overlapping groups: students enrolled in a course on general relativity, students 
and others indulging in the admirable practice of self-study, professional physicists in 
other research specialties who want to brush up, and readers of popular books on Einstein 
gravity who want to fly beyond the superficial discussions these books (including my own⁷)
offer. My comments below apply to some or all of these groups.⁸

Personally, I feel special sympathy for those studying the subject on their own, as I 
remember struggling⁹ one summer during my undergraduate years with a particularly
idiosyncratic text on general relativity, the only one I could find in São Paulo back in those
antediluvian times. That experience probably contributed to my desire to write a textbook 
on the subject. From the mail I have received regarding QFT Nut, I have been pleasantly 
surprised, and impressed, by the number of people out there studying quantum field 
theory on their own. Surely there are even more who are capable of self-studying Einstein 
gravity. All power to you! I wrote this book partly with you in mind. 

Serious students of physics know that one can’t get far without doing exercises. Some 
of the exercises lead to results that I will need later. 

Quite naturally, I have also written this book with an eye toward quantum field theory and 
quantum gravity. While I certainly do not cover quantum gravity, I hope that the reader who 
works through this book conscientiously will be ready for more specialized monographs¹⁰
and the vast literature out there. 

So, I prevaricated a little earlier. In the latter part of the book, occasionally you will need 
to know more than classical mechanics and electromagnetism. But, to be fair, how do you 
expect me to talk about Hawking radiation, a quintessentially quantum phenomenon, in 
chapter VII.3? Indeed, how could we discuss natural units in the introduction if you have 
never heard of quantum mechanics? For the readers with only a nodding acquaintance 
with quantum mechanics, the good news is that for the most part, I only ask that you 
know the uncertainty principle. 

I do not doubt that some readers will encounter difficult passages. That’s because I have 
not made the book “any simpler”! 

In the preface to the second edition of my quantum field theory book, I mentioned that 
Steve Weinberg and I, each referring to his own textbook, each said, “I wrote the book that 
I would have liked to learn from.” So this is the book I would have liked as an undergrad* 
eager to learn Einstein gravity. I would have liked having at least a flavor of what the latest
excitement was all about. In this spirit, I offer chapter X.6 on twistors, for example, trusting
the reader to be sophisticated enough to know that all one should expect to get from a single
textbook chapter is an entry key to the research literature rather than a complete account
of an emerging area.


* In a letter to the editors of Physics Today in 2005, A. Harvey and E. Schucking wrote that, in view of the
“monumental lip service” paid to Einstein in the physics community, “it is a scandal” that Einstein gravity is still
not regularly taught to undergraduates. I find it even more of a scandal that many physics professors proudly
profess ignorance of Einstein gravity, saying that it is irrelevant to their research. Yes, maybe, but this is akin to
being proudly ignorant of Darwinian evolution because it is irrelevant to whatever you are doing.


The importance of feeling amazed, and amused 


I am amazed that students are not amazed.

The action principle amazed Feynman when he first heard about it. In learning theoretical
physics, I was, and am, constantly amazed. But in teaching, I am amazed that students
are often not amazed. Even worse, they are not amused. 

Perhaps it is difficult for some students to be amazed and amused when they have to 
drag themselves through miles of formalism. So this exhortation to be amazed is related 
to my attempt to keep the formalism to an absolute minimum in my textbooks and to get 
to the physics. 

To paraphrase another of my action heroes, students should be required to gasp and 
laugh¹¹ periodically. Why study Einstein gravity unless you have fun doing it?


As much fun as possible 


Bern started his review of my quantum field theory textbook thus: 


When writing a book on a subject in which a number of distinguished texts already exist, any
would-be author should ask the following key question: What new perspectives can I offer that
are not already covered elsewhere? . . . perhaps foremost in A. Zee’s mind was how to make
Quantum Field Theory in a Nutshell as much fun as possible.


Good question! My answer remains the same. I want to make Einstein gravity as much 
fun as possible. 

Sidney Coleman, my professor in graduate school and thesis advisor, once advised me 
that theoretical physics is a “gentleman’s diversion.” I was made to understand that I 
should avoid doing long sweaty calculations. This book reflects some of that spirit. Thus, 
in chapter VI.1, instead of deriving Einstein’s field equation as a true Confucian scholar 
would, I try to get to it as quickly as possible by a method I dub “winging it southern 
California style.” Similarly, in chapter VI.2, I get to cosmology as quickly as possible.

This invariably brings me to the dreaded topic of drudgery in general relativity. Many 
theory students in my generation went into particle physics rather than general relativity 
to avoid the drudgery of spending an entire day calculating the Riemann curvature tensor. 
I did.¹² But that was the old days. Nowadays, students of general relativity can use ready-made
symbolic manipulation programs¹³ to do all the tedious work. I strongly urge you,
however, to write your own programs, as I did, rather than open a can. It also goes without
saying that you should calculate the Riemann curvature tensor from scratch at least a few
times to know how all the cogs fit together.


You make the discoveries 


My pedagogical philosophy is to let students discover certain things on their own. Some 
of these lessons evolved into what I call extragalactic fables. For example, in part IV, I let 
the extragalactic version of you discover electrodynamics and gravity. In chapter IV.3, you 
discover that gravity affects the flow of time. 

I also whet your appetite by anticipating. For example, I mention the Einstein-Rosen 
bridge already in chapter I.6. In working out the shortest distance between two points in 
chapter II.2, I mention that you will encounter the same equations when you study motion 
around black holes. In part II, I note that the peculiar replacement of a simple equation by 
a more complicated-looking equation foreshadows Einstein’s deep insight about gravity to
be discussed in part V. 


The return of Confusio 


Readers of QFT Nut might be pleased to hear that Confusio makes a return appearance, 
together with other characters, such as the Smart Experimentalist. Some other friends of 
mine, for example the Jargon Guy, also show up. Here I am alluding to what Einstein 
referred to as “more or less dispensable erudition.”¹⁴


An outline of this book 


This book appears to start at a rather low level, with a review of Newtonian mechanics 
in part I. The reason is that I want to treat two topics more thoroughly than usual: 
rotations and coordinate transformations. A good understanding of these two elementary 
subjects allows us to jump to the Lorentz group and curved spacetime later. My pedagogical 
approach is to beat 2-dimensional rotations to death. Depending on how mechanics is 
taught, students typically miss, or fail to grasp, some of the material in the chapter on 
tensors. I repeat the discussion of tensors under various guises and in different contexts. 
One of my students who read the book pointed to various places where I appear to repeat
myself, but I told her that it is better to hear some key point for the third time¹⁵ than not to
have understood it at all. A respected senior colleague and pioneer in Einstein gravity said 
to me that a good teacher is someone who never says anything worth saying only once. 

I devote part II to a discussion of the all-important action principle, because I believe 
that it provides the quickest, and the most fundamental, way to Einstein gravity (and to 
quantum field theory). Part III is devoted to special relativity but, in contrast to some 
elementary treatments, the emphasis is on geometry and completion, not on a collection of
paradoxes. In part IV, as was mentioned earlier, I let you discover electromagnetism and 
gravity, and so the treatment is somewhat nonstandard. Thus, even if you feel that you 
already know special relativity, you might want to take a quick look at part III and part IV. 

Many readers probably pick up this book because of a burning desire to learn Einstein 
gravity. These readers would have already mastered Newtonian mechanics and special 
relativity, and they could probably cut to the chase and skip directly to part V. To them, the 
first four parts may appear to be a rather leisurely preparation for Einstein gravity. Still, I 
would counsel skimming, rather than skipping, the first four parts. At the very least, parts 
I-IV set down the conventions and notation. More importantly, they offer up the ideology 
of this text, an ideology that can be simply stated: action! 

While I appear to start slow in parts I-III, I am actually setting things up so that we can 
go fast in parts V and VI. For example, all the discussion about coordinate transformation 
and curved spaces is to prepare the reader for a quick plunge into curved spacetime in 
chapter V.1. Similarly, the action principle enables the geodesic equation to be introduced 
early on, in part II, so that it is “ready to trot” when needed in part V. In considering
whether to sign up for my course that grew into this book, some students ask how fast I 
will be zooming through special relativity to get to the “good stuff.” But special relativity 
is good stuff! In particular, it is essential to understand special relativity as the geometry 
of spacetime* before moving on to general relativity. 

The essence of Einstein gravity is explained in parts V and VI. The rest of the book 
contains what may be regarded as applications of the theory as developed in part VI. Part X 
contains extras that some might consider beyond the scope of an introductory text. The 
title is thus something of a misnomer, but to please my publisher, I am obliged to keep 
up a running joke I started with my field theory book. A better title might be Gravity from 
Newton to the Brane World. 


The role of appendices 


As a textbook writer, I am torn between being concise and being complete. One way out is to
place numerous topics in appendices to various chapters. Some are fun, such as Einstein’s
derivation of E = mc² in his 1946 Haifa lectures (see chapter III.6), which, unfortunately,
is in danger of being forgotten and which I much prefer to his 1905 derivation. Another
example is Weyl’s shortcut to the Schwarzschild solution (see chapter VI.3). Some are
results I will need later, but often much later. For example, I talk about the speed of sound 
in an appendix to chapter III.6, but I won’t need it until I get to the cosmic microwave 
background. Some appendices are peripheral or technical. When possible, I try to give an 
intuitive and heuristic understanding before launching into a long development, such as
the treatment of Fermi normal coordinates. Some are for enrichment. In sum, the use of
appendices represents my effort to appeal to a broad range of readers with enormously
different levels of knowledge and sophistication. The reader should not feel obliged, upon
first reading this book, to study all the appendices. Each should exercise his or her own
judgment.


* A multitude of books treat special relativity, but while they all get the job done, they differ widely in conceptual
clarity. Besides the geometrical view of special relativity, I also want to emphasize the Lorentz action as leading
to a unified approach to both massive and massless particles.

Still, a book this size is inevitably incomplete, and so it comes down to the author’s 
choice (of course). So many beautiful results, so little space and time! I regard certain 
topics, though important, as better covered in more specialized tomes, such as gravitational 
lensing, and prefer to include some topics not discussed in several standard textbooks, such 
as anti de Sitter spacetime, brane worlds, and twistors. 


The most incomprehensible thing about some physics textbooks 


The most incomprehensible thing about the physical world is 
that it is comprehensible. 


—A. Einstein 


The most incomprehensible thing about some physics textbooks is that they are
incomprehensible.

They manage to render the easily comprehensible into the nearly incomprehensible. 
Some textbook writers are simplifiers, others are what I call complicators. In defiance of 
Einstein’s exhortation, many authors strive to make physics as complicated as possible, or 
so it seems to me. In the research literature, the cause of obscurity may be unintentional 
or intentional: either the author has not understood the issues involved completely (often 
laudably so, when the author is at the cutting edge), or the author wants to impress upon 
the reader the profundity of his or her paper by resorting to obfuscations. But in a textbook? 

My task, and hope, in my textbooks is to make physics as simple as possible, as the “old
man” with his toy¹⁶ said. Having written both a textbook and a couple of popular books, I
am perhaps qualified to express my opinions here. Popular books attempt to make physics 
simpler than it really is, thus in some sense deceiving the reader. Textbooks are different: 
they must make the reader work to master the subject. But making the reader work is not 
the same as making the reader suffer by rendering simple things obscure. 


No bijective maps in this book 


I am puzzled by students who profess no trouble with the physics but moan* about the 
math. All the “grown-ups” would say the opposite. The pros regard Riemannian geometry,
which is after all totally logical and algorithmic, as easy, but continue to lose sleep over
Einstein’s theory. Regarding the math, I can say, with only slight exaggeration, that mastery
of the index notation and the chain rule almost suffices. Indeed, any serious student with
a future in theoretical physics should be continually puzzled by the physics but not at all
by the math.


* Indeed, many of the postings on the sites of online booksellers regarding general relativity texts lament the
difficulty of the math. At the other extreme, a few, by misguided individuals in my opinion, complain about the
lack of rigor.

Einstein did not say that physics should be made simple. Of course, physics is not 
simple, and understanding Einstein’s theory does require effort. Surely you have heard 
that Einstein gravity involves curved spacetime, so there is no getting around learning 
the language needed to describe curvature. My strategy is to introduce math only when 
necessary, and then to illustrate the key concepts with plenty of examples. I dislike the Red 
Army¹⁷ approach, and so I do not start by defining bundles on the tangent plane. I bring
in the math gently and sneak in curvature early on via the familiar change of coordinates. 

As for rigor, I will let yet another of my action heroes speak. “I’ll differentiate any 
function, even the freaking delta function, as many times as I darn well please.” So if you
have to differentiate, just differentiate until the expression you are differentiating starts 
bleating for mercy. The trick is to know when it is absolutely necessary to be rigorous 
(which is seldom—I would never say never). 

I respectfully submit that this book is not for those who want rigor. 

While I realize the need for and the benefit of precise definition, for the most part I 
simply plead membership in the Feynman¹⁸ “Shut up and calculate” school of physics.¹⁹
Thus, I won’t trouble your sleep with assertions such as “A bijective differentiable map of
a manifold, whose inverse is also differentiable, is called a diffeomorphism.” Regarding
statements like this, I think that another Einstein quote may be apropos: “We should
take care not to make the intellect our god; it has, of course, powerful muscles, but no
personality.”²⁰ Yet another relevant quote: “The people in Göttingen sometimes strike me,
not as if they wanted to help one formulate something clearly, but instead as if they wanted
only to show us physicists how much brighter they are than we.”²¹ Alas, “the people in
Göttingen” have now gone off and multiplied,* and some even live in our midst. Precise
definitions are indeed necessary occasionally, but by and large, they don’t do much good
in theoretical physics. Some things are better left undefined. In this connection, also keep
in mind the distinction between true clarity and false clarity.²² For example, I consider the
insistence on saying “pseudo-Riemannian manifolds” in a book of this level false clarity 
at best. 

As I was putting the finishing touches on this book, I read about some notes²³ Feynman
scribbled to himself before teaching some course: “First figure out why you want the 
student to learn the subject and what you want them to know, and the method will result 
more or less by common sense.” Well said! As it turned out, that was the method I followed 
when writing this book. 

If you feel that bijection is indispensable for your existential essence, then I also respectfully
submit that this book is not for you.


* One tribe is known to look at “old fashioned” indices with contempt. Only coordinate-free notations²⁴ are
good enough for them. 
But of course I am not against mathematics. For instance, I am all for differential forms
(see chapters IX.7 and IX.8). However, when faced with a new formalism, I tend to be 
practical and ask, “For the time invested in learning it, what is the payoff?” How significant 
is it for the physics? 


Teaching from this book and self-studying 


It would be ideal to teach a leisurely year-long course based on this book. But I have also 
taught Einstein gravity at the University of California, Santa Barbara, as a scandalously 
short one-quarter undergraduate course consisting of only 29 lectures. The students allegedly
knew the action principle and special relativity, but I was appropriately skeptical.
Here is the actual course plan. 

Lecture 1 gives an overview. Lectures 2-6 cover chapters I.5 and I.6, starting with the
notion of a metric and illustrated with numerous examples, including the Poincaré half 
plane, and ending with locally flat coordinates and a count of the components contained 
in the curvature tensor. Lectures 7 and 8 cover part II, and lectures 9 and 10 part III. In
lectures 11 and 12, I let the students discover electromagnetism and gravity and derive 
how gravity affects the flow of time. Lectures 13-15 introduce the equivalence principle 
and cover part V up to chapter V.3, ending with closed, flat, and open universes. 

The second half of the course proceeds as follows: 


Lecture 16: the geodesic equation reduced to Newton’s equation, gravitational redshift,
spherically symmetric spacetime with time dependence
Lecture 17: the motion of particles and light in static spherically symmetric spacetime
Lecture 18: covariant differentiation, the geometrical picture
Lecture 19: to Einstein’s field equation as quickly as possible
Lecture 20: the Riemann curvature tensor and its symmetry properties
Lecture 21: the Einstein-Hilbert action
Lecture 22: the cosmological constant and the expanding universe
Lecture 23: Schwarzschild metric, with precession of planets and radar echo delay described
in words and pictures
Lecture 24: the energy momentum tensor
Lecture 25: general proof of energy momentum conservation
Lecture 26: the Einstein tensor and the Bianchi identity
Lecture 27: black holes in various coordinates
Lecture 28: the causal structure of spacetime
Lecture 29: Hawking radiation and a grand review


So it is entirely possible to cover the bulk of this book in a one-quarter course! I did it. 
Students were expected to do some reading and to fill in some gaps on their own. Of course,
instructors could deviate considerably from this course plan, emphasizing one topic at the
expense of another. They might also wish to challenge the better students by assigning the 
appendices and some later chapters. 

Here I come back to those I applauded earlier for self-studying Einstein gravity. Some 
of you might want to know which chapters to read. The answer is of course that you 
should read them all, in an ideal world. But if you want to get “there” quickly, I suggest 
the following. You are on your own regarding the first three parts: it all depends on what 
you already know. So try starting with part IV and see how often you need to refer back to 
an earlier chapter. Part V is indispensable, particularly the equivalence principle and the 
tour of curved spacetimes. You need to understand the covariant derivative, but you could 
skip the somewhat heavier appendices in chapter V.6. After the covariant derivative, you 
are ready for the heart of the matter, Einstein’s field equation, in chapter VI.1. The rest
of part VI forms the core of a traditional course on general relativity, but my emphasis 
is somewhat less on working out orbits in detail. That’s it! You would have then reached 
a certain level of mastery of Einstein gravity. You could then regard the rest of the book, 
parts VII-X, as a buffet of topics that you could browse at your leisure. Part X contains 
more speculative topics, including some that may not be of lasting value. Be warned! 
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Notes 
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19. 


. Hereafter referred to as QFT Nut. 

. A. Zee, Fearful Symmetry. Hereafter, Fearful. 

. See chapter VI.3. 

. Chaim Weizmann, the first president of Israel and a chemist, once crossed the ocean with Albert Einstein 


on the same liner, and Einstein tried to explain the theory of relativity to him. When asked about this later, 
Weizmann said something like “I did not understand his theory, but he certainly convinced me that he did.” 


. For the record, I took a philosophy course in college. To further emphasize that I am not totally lacking in 


“philosophical credentials,” I was once invited by a philosophy professor to lecture, thanks to one of my 
popular books, to an auditorium full of philosophers. I like philosophers. 


. Einstein once said that he should have called his work “invariance theory” and lamented his use of the word 


“relative.” 


. A. Zee, An Old Man's Toy. Hereafter, Toy/Universe. 
. In my introduction to Feynman’s book on quantum electrodynamics, I wrote about three different kinds of 


readers of that book. Only part 0 of this book will be comprehensible to the first kind. See R. P. Feynman, 
QED: The Strange Theory of Light and Matter, with a new introduction by A. Zee, Princeton Science Library, 
2006. 


. An undergrad friend had also deluded me into thinking that it was salutary to read Einstein in the original 


German! 


. Read J. Polchinski, String Theory, for example. 
. QFT Nut, p. 473. 
. For the record, I started my research career with John Wheeler, studying gravitational wave emission from 


neutron stars. For Wheeler’s influence on his students, see Charles W. Misner, “John Wheeler and the 
Recertification of General Relativity as True Physics,” in General Relativity and John Archibald Wheeler, ed. I. 
Ciufolini and R. Matzner, Springer, 2010. 


. See my remarks in chapter IX.9, for example. 
. A. Einstein, Autobiographical Notes, Open Court, 1999. 
. In any case, if you think that I talk too much about tensors, you could simply feel smugly superior to those 


poor souls who never get it. 


. See Toy/Universe. Also see figure 2b in the prologue to book two. 
. I learned this terminology (which, I should clarify, referred to the Russian, not the Chinese, version) in a 


conversation with Steve Weinberg about textbooks. It has something to do with lining up all the tanks first. 


. A colleague who got his doctorate at Caltech told me the following story. He was examined by a committee 


consisting of Feynman and a bunch of lesser lights. One of the lesser lights posed a question to my friend, who 
proceeded to answer it perfectly, outlining the calculation necessary and explaining the physical significance 
of the result. The lesser light then opined ominously, “You should have also said . . . ” and hereforth issued 
from his mouth a long string of highfalutin hundred-dollar words. Feynman turned to the lesser light and 
announced to the rest of the room, “But that’s exactly what he said!” 

Here is a totally gratuitous Feynman story that has nothing to do with the discussion at hand. During the 
exam, Feynman asked a question about quantum mechanics that the student was unable to answer. Feynman 
exploded, saying something like “Quantum mechanics was invented in the 1920s and it’s now 1972; you 
really should have mastered quantum mechanics by now!” A committee member turned to Feynman and 
said softly, “Dick, Dick, it’s now 1973.” 

A colleague told me his retort to Feynman: “Shut up and contemplate.” Of course, Feynman is capable of 
doing both. Contrary to myth, Feynman won the national Putnam mathematics competition. Here we are 
talking about people who can only talk and not calculate. 
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21. 


22. 


23. 


24. 


The quote is possibly apocryphal. 

Quoted in C. Reid, Hilbert, Springer, 1996, p. 142. 

As one of my professors, an exceedingly distinguished theoretical physicist, used to say, the main purpose 
of all the talk about tangent bundles and pullback is to frighten young children. This is not entirely true, but, 
oh well. 

R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, volume III, Addison Wesley 
(Commemorative issue 2004), p. xi. 

Iam certainly not against coordinate-free notations. In physics, the only issue is which notation is best suited 
for the job at hand. Coordinate-free notations are great for proving general theorems but are not so good for 
calculating. In this connection, I might regale the reader with a story. At a recent Santa Barbara conference 
on black holes, dS, AdS, gravity dual, and so on—in short, the latest hot stuff—I was chatting at lunch 
with two leading young researchers, up and coming stars, not some aging curmudgeons with congealed 
opinions. When I mentioned how some people clamored for index-free notations, one of these two leading 
lights basically said to please get those people out of her sight. The other told me a more illuminating story. 
During grad school, to deepen his understanding of Einstein gravity, he enrolled in a course taught by a 
famous mathematician. As it happened, he was the only student able to do the problems in the final exam 
involving actual calculations: he did them by first using old fashioned indices and then translating back into 
the abstract notation used in the course. 

The index-free notation in Einstein gravity is somewhat analogous to using vectors without committing 
to any specific coordinate choice. For example, one can prove easily that $\vec L = \vec r \times \vec p$ is conserved, but try to
do the spinning top on an oscillating inclined plane without setting up coordinates! The difference between 
the uninitiated and the misinformed is that the uninitiated is not acquainted with a particular formalism, 
while the misinformed insists that only the particular formalism he or she likes is any good. 


Part 0 Setting the Stage


Prologue


Three Stories 


Story 1: The drowning beauty and the scrawny lifeguard 


Since I started my quantum field theory text¹ with a story, possibly apocryphal, about
Feynman in a quantum mechanics class, I feel compelled to start this text also by telling a
story, possibly true,² about Feynman. The movie opens on a gorgeous southern California
beach. We zoom in on a lifeguard, noticeably scrawnier than the other lifeguards. But 
on the other hand, we soon discover that he is considerably smarter. Egads, it is Dick 
Feynman, in the days before Baywatch! Perched on his high chair, he has been watching 
an attractively curvaceous swimmer with great interest, plotting how he could win the girl’s 
affection, all the while solving a field theory problem in his head. Suddenly, he notices that 
the girl is splashing about frantically. She is going under! Must be a cramp! An action hero 
is as an action hero does: Feynman jumps down from his lookout and goes into action.* 

The other lifeguards are already proceeding in a straight line (starting from point F, the 
lifeguard station, in figure 1, going along the dotted line) toward the girl (at point G). That 
would be the path of least distance. But no, Feynman has already calculated the path that 
would allow him to reach the girl in the least amount of time. Time counts more than 
space here: least time trumps least distance. Our hero (like other humans) can run much 
faster, even on a soft sandy beach, than he can swim. So the rescuer should spend more 
time running before plunging into the sea. A simple high school level calculation (exercise 
1) shows Feynman the best path to take (see the solid line in figure 1). Our hero beats the 
other guys and gets to the eternally grateful girl first! 
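
Exercise 1 asks for the analytic answer. As a numerical sanity check, here is a minimal sketch in Python that finds the least-time entry point by a simple search. The geometry (F a distance a above the shoreline, G a distance b below it, a horizontal offset D between them) and the running and swimming speeds are hypothetical numbers chosen for illustration, not values from the story.

```python
import math

def travel_time(x, a, b, D, v_run, v_swim):
    """Time to run from F = (0, a) to the shoreline point (x, 0), then swim to G = (D, -b)."""
    return math.hypot(x, a) / v_run + math.hypot(D - x, b) / v_swim

def best_entry_point(a, b, D, v_run, v_swim, tol=1e-9):
    """Ternary search for the minimum of travel_time, which is convex in x."""
    lo, hi = 0.0, D
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, a, b, D, v_run, v_swim) < travel_time(m2, a, b, D, v_run, v_swim):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical numbers (meters and meters per second), for illustration only.
a, b, D = 30.0, 20.0, 50.0
v_run, v_swim = 7.0, 1.5

x_best = best_entry_point(a, b, D, v_run, v_swim)
x_straight = D * a / (a + b)  # where the straight line from F to G crosses the shoreline

# Sines of the angles the two legs make with the normal to the shoreline.
sin_sand = x_best / math.hypot(x_best, a)
sin_water = (D - x_best) / math.hypot(D - x_best, b)

t_best = travel_time(x_best, a, b, D, v_run, v_swim)
t_straight = travel_time(x_straight, a, b, D, v_run, v_swim)
print(f"least-time entry point x = {x_best:.3f} (straight line crosses at {x_straight:.3f})")
print(f"time saved over the straight line: {t_straight - t_best:.3f} s")
print(f"sin ratio = {sin_sand / sin_water:.4f}, speed ratio = {v_run / v_swim:.4f}")
```

With these assumed numbers the best entry point lies farther along the beach than the straight-line crossing, and the ratio of the two sines agrees with the ratio of the two speeds, which is the content of exercise 1.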


* “Physics is where the action is.” See chapter III.2. 
Figure 1 The best possible path for Feynman to follow
to get to the drowning girl is along the solid lines
from F to G. (Labels in the figure: sand, water, M.)


But you don’t have to calculate to see that there is an optimal path. Only a cretin would 
follow the third path (the dashed line) shown in the figure! 

In the 17th century, Fermat discovered that light, just like Feynman, also follows a least 
time principle, and as a result “bends” as it enters from one medium (say, air) into another 
(say, water). To read these very words, you have, or rather your saintly mother has, cleverly 
positioned in your eyes a blob of watery substance (known to the cognoscenti as a lens) 
that you squeeze just so, using tiny muscles, to bend light to your advantage and bring the 
ambient light bouncing off these words on the printed page into focus. Your mother, as the 
product of eons of evolution, was oh so clever, giving you eyes. As we speak (so to speak), 
you are using precisely this phenomenon of light bending to save the light entering your 
eyes some time, a phenomenon known as refraction, and to gain yourself some knowledge 
about physics and the universe—an activity evolution applauds: reading this book could 
conceivably boost your reproductive advantage. 

We all know that light travels in a straight line, but we also notice easily that when light 
enters water from air, it bends (as shown in figure 1 with “sand” replaced by “air”). Indeed, 
that explains why people standing in swimming pools appear to have comically short legs,* 
a phenomenon you can test by sticking a pencil in a glass of water. 

It has also been known ever since Euclid† that the shortest path between two points
is a straight line. Ergo, if light is always in a hurry to get from one point to another, it 


* If you can’t explain that, see figure 7.1 in Fearful. See also the common mirage shown in figure 7.2: on a 
hot day, the highway beneath a distant car appears to be wet, but is in fact dry. This mirage shows that light only 
cares about the local, not the global, minimum in time of transit. 

† Babies have no need for Euclid; as soon as they can crawl, they move toward the obscure objects of their
desire along a straight line. 
wants to move in a straight line. Fermat and others realized that the bending of light could
be explained if light moves more slowly in water than in air. Indeed, if light were really 
stupid, it would move in a straight line through point M to get from F to G, just like the 
other lifeguards. 


Story 2: An ant and her honey 


When I was a kid, I was challenged by a puzzle about an ant and a drop of honey. An ant 
located on the outside of a cylindrical glass of radius R and a vertical distance d below the 
rim, sees, never mind how, or perhaps smells, a drop of honey directly opposite her, but 
on the inside of the glass (see figure 2a). The ant wants to get to the honey in the shortest 
possible time,³ crawling at some constant speed.

The solution depends on a cute trick. Imagine that the glass is made of paper. Tear out 
the bottom and cut the cylindrical glass down some vertical line. Lay the paper down flat, 
as shown in figure 2b. Further, imagine the paper to be double-sheeted, so the side with the 
drop of honey could be folded out, as shown in figure 2c. Now clearly, the path of shortest 
distance between the ant and the honey is a straight line, with distance $\sqrt{(\pi R)^2 + (2d)^2}$.
The path is also indicated in figure 2b, with the segment inside the glass indicated by a 
dotted line. A really dumb ant would go up vertically to the rim of the glass, then move 
along the rim to a point above the honey, and then go down (or along a number of similar 
paths equal in distance to the one just described). 
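
To see how much the unfolding trick buys the ant, here is a small sketch comparing the two paths. It assumes, as the distance formula above implies, that the honey sits the same vertical distance d below the rim as the ant; the radius and depth are hypothetical numbers.

```python
import math

def shortest_path(R, d):
    """Straight line on the unfolded double sheet: pi*R across, 2*d vertically."""
    return math.hypot(math.pi * R, 2.0 * d)

def dumb_path(R, d):
    """Straight up to the rim, halfway around the rim, straight down to the honey."""
    return d + math.pi * R + d

# Hypothetical glass: radius 4 cm, ant and honey both 3 cm below the rim.
R, d = 4.0, 3.0
smart = shortest_path(R, d)
dumb = dumb_path(R, d)
print(f"smart ant: {smart:.3f} cm, dumb ant: {dumb:.3f} cm")
```

The dumb path is the sum of the two legs of a right triangle, while the smart path is its hypotenuse, so the smart ant always wins unless d or R is zero.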

This puzzle contains two of the themes central to this book: the shortest path between 
two points and curvature, intrinsic and extrinsic. 


Figure 2 The best possible path for the ant to follow to get to her honey. (Labels in the figure: ant, honey.)
Draw circles and triangles on a flat piece of paper. Then roll the paper up into a cylinder.
The radius and circumference of a circle maintain the same value as when flat: the paper 
is neither stretched nor compressed in any way. Similarly, the three angles in the triangle 
remain the same. A cylinder has extrinsic curvature, but zero intrinsic curvature: it is 
intrinsically flat. In contrast, the sphere is intrinsically curved: there is no way to construct 
a sphere from a flat piece of paper without stretching and compressing the paper. 

The proverbial guy and gal in the street think that cylinders are curved, but you and the 
ant* know better. The uninitiated are talking about extrinsic curvature, regarding how the 
2-dimensional surface of a cylinder is embedded into an external 3-dimensional Euclidean 
space. 

Imagine a civilization of mites living on some curved surface. The mites are much 
smaller than the characteristic radius of the curvature of the surface. Once they learn 
how to measure the distance along any path (by pacing off the steps they have to take, 
for instance) they are ready for geometry. They could define the straight line between two 
points $P_1$ and $P_2$ as the path of least distance. Eventually, the mite professors of geometry
could determine whether the world of mites is curved without getting out of their world to 
take a look. For example, with enough government funding, the professors could organize 
teams of mites to draw small circles of any desired radius by finding the set of all points a 
fixed distance from a given point P. Then they can measure the circumference of the circle 
and compute 


$$ R = \lim_{\text{radius} \to 0} \frac{6}{(\text{radius})^2} \left( 1 - \frac{\text{circumference}}{2\pi \, \text{radius}} \right) \tag{1} $$

as the circle shrinks to zero. For flat space, R vanishes everywhere. Thus, a nonvanishing
value of R gives the mites a measure of the intrinsic curvature at P, that is, of how the geometry
of their world differs† from Euclid’s flat geometry. (The factor of 6 provides a convenient
normalization to match another definition of R to be given later.) Another measure would
be the extent to which the sum of the angles enclosed by a triangle deviates from $\pi$.
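
Here is a sketch of what the mite professors would compute. It assumes, purely for illustration, that their world is an ordinary sphere of radius a, on which a circle of geodesic radius r has circumference 2πa sin(r/a), a standard result. Expanding the sine then gives the limit 1/a² for the quantity in equation (1).

```python
import math

def curvature_estimate(radius, circumference):
    """The quantity inside the limit of equation (1), for one small circle."""
    return 6.0 / radius**2 * (1.0 - circumference / (2.0 * math.pi * radius))

def sphere_circumference(a, r):
    """Circumference of a circle of geodesic radius r drawn on a sphere of radius a."""
    return 2.0 * math.pi * a * math.sin(r / a)

def flat_circumference(r):
    """Circumference of a circle of radius r on a flat sheet (or a cylinder)."""
    return 2.0 * math.pi * r

a = 2.0  # hypothetical radius of the mites' spherical world
for r in (0.1, 0.01, 0.001):
    on_sphere = curvature_estimate(r, sphere_circumference(a, r))
    on_flat = curvature_estimate(r, flat_circumference(r))
    print(f"r = {r}: sphere gives {on_sphere:.8f}, flat sheet gives {on_flat:.8f}")
```

As the circles shrink, the estimate on the sphere settles toward 1/a², while on the flat sheet it vanishes, all without the mites ever leaving their surface.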

Our mites are not interested in the extrinsic curvature, since they cannot get off the 
surface to take a look. Similarly, we are only interested in the intrinsic curvature of our 
universe, not in the extrinsic curvature, since we cannot get out‡ of the universe to take
a look. 


* Ants will eventually find the shortest path to food if the starting point is the location of the colony, but you 
need a whole colony of them to do so. Their trick is to lay down pheromone on the path as they go along and to 
prefer to follow paths with the stronger pheromone. It is crucial that the pheromone evaporates at some fixed rate 
and that ants often wander off the beaten paths to try out nearby paths. (Moral: wander off the beaten paths!) We 
explore this variational principle in chapter II.2. A multitude of physicists may also eventually solve the mystery 
of quantum gravity. The paths correspond to published papers, the strength of the pheromone to the prestige of 
the authors and the number of citations received, and so on and so forth. Not a perfect analogy by any means. 

† Early in the 20th century, a distinguished professor, Sir Arthur Eddington, did precisely that, defining a
straight line by the trajectory of light. See chapter VI.3. 

‡ There exist some wild speculations that our universe is embedded in a much larger spacetime, but even in
these theories, it does not appear that their proponents can get out of our universe, at least not until after this 
book is published. See chapter X.2. 
Story 3: Dueling thinkers


Professor Vicious and Dr. Nasty have been at each other’s throats for decades. Theoretical 
physicists are forever fighting over who did what when. They are constantly bickering, 
telling each other (as the joke goes), “Nyah, nyah, what you did is trivial and wrong, and I 
did it first!” 

Of course, the fight for credit goes on in every field, but in theoretical physics, it is almost 
a way of life, since ideas are by nature ethereal. And the stakes are high: the victor gets to 
go to Stockholm, while the loser is consigned to the dustbin of history, a history largely 
written by the victor with the help of an army of idolaters and science writers. 

We are finally going to settle matters between Vicious and Nasty once and for all. We 
place the two of them at two ends of a long hall, Vicious at x = 0 and Nasty at x = L. 

We now tell Vicious and Nasty to solve the basic mystery of why the material world comes 
in three copies.⁴ As soon as they figure it out, they are to push a button in front of them.
When the button is pushed, a pulse of light is flashed to the middle of the room where, 
at x = L/2, our experimental colleague, an electronics wiz, has set up a screen. When the 
screen detects the arrival of a light pulse, all kinds of bells and whistles are rigged to go 
off. In particular, if, and only if, two light pulses arrive at the screen at precisely the same 
instant, a huge imperial Chinese gong will be bonged. 

“Fair is fair, any and all priority claims will be settled,” we tell Vicious and Nasty. “Now 
go to work and solve the mystery of the family problem: why do quarks and leptons come 
in three sets?” The dueling duo immediately assume the Rodinesque pose of the deep 
thinker and lock themselves in a think to the death. 

Meanwhile, you are sitting on a train, moving smoothly relative to the dueling thinkers. 
Denote the time and space coordinates in your rest frame by $t'$ and $x'$. In the Newtonian
universe, time is absolute, and so we have $t' = t$. In your frame, Vicious and Nasty are
moving by according to $x' = vt$ and $x' = L + vt$, respectively, but you are sitting at $x' = 0$.
Of course, in the duelists’ frame, you are the one who appears to be moving, gliding by at
$x = -vt$ (see figure 3).

Some time passes, and all of a sudden we all hear a loud bong of the gong. “The best 
possible outcome, you solved the problem simultaneously!” we exclaim joyously with 
much relief. “You guys are equally smart and you should go to Stockholm together!” 

The arrangement is electronically fool-proof. We won’t have either of them gloating, “I 
did it first!” Peace shall reign on earth. But guess what? 

A Swede is sitting next to you. He, too, heard the gong. That’s the whole point of the 
gong: you either heard it or you didn’t. It is all admissible in a court of law. Now, not only is 
the Swede on the Committee, but he also happens to be an intelligent Swede. He reasons 
as follows. 

The two thinkers are gliding by as described by $x' = vt'$. When Professor Vicious pushed
the button, she sent forth a multitude of photons surging toward the screen at the speed 
of light c. But the screen was also moving forward, away from the surging photons. Of 
Figure 3 Professor Vicious versus Doctor Nasty. (Labels in the figure: screen with light detector; x = 0, x = L/2, x = L.)


course, light moves at the maximum allowed speed in the universe, and it soon catches 
up with the screen. The opposite is true for Dr. Nasty. The screen is moving toward the 
photons he sent forth. Thus, to reach the screen, his photons have less distance to cover 
than Vicious’s photons. 

Hence, reasons the Swede, for the two bunches of photons to reach the screen at the 
same time and so cause the gong to bong, the photons sent out by Vicious must have 
gotten going earlier. Thus, Vicious solved the problem first. With malicious glee, the Swede 
solemnly intones, “After Professor Vicious is awarded the Nobel Prize, she will kindly help 
us stuff Dr. Nasty into the dustbin of history!” 

As Vicious⁵ enjoys her fleeting immortality, we bemoan or toast, as our taste might
be, the fall of simultaneity. Nasty, trying to climb out of the dustbin, insists that he and 
Vicious had been sitting still, thinking hard, and it was the Swede who was moving. Since 
the gong had bonged, Nasty is absolutely sure that he and Vicious hit their buttons at the 
same instant and so he is entitled to half the prize, while the Swede is equally sure that 
Vicious hit her button before Nasty hit his. 

The very notion of simultaneity depends on the observer! 
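
The Swede’s reasoning can be turned into a two-line calculation. The sketch below works entirely in the train frame and takes as input the distance from each thinker to the screen as measured on the train; the name half_length and the sample numbers are my own illustrative choices, and how that distance compares with L/2 in the hall frame is a separate question not addressed at this point in the text.

```python
def emission_lead(half_length, v, c=1.0):
    """How much earlier Vicious must flash than Nasty, according to the train frame.

    half_length: distance from each thinker to the screen, measured on the train.
    v: speed of the hall (thinkers and screen) relative to the train.
    c: speed of light, the same for both pulses in the train frame.
    """
    t_vicious = half_length / (c - v)  # her light chases a screen that is running away
    t_nasty = half_length / (c + v)    # his light meets a screen that is rushing toward it
    return t_vicious - t_nasty

# Hypothetical numbers in units with c = 1.
for v in (0.0, 0.1, 0.5, 0.9):
    print(f"v = {v}: Vicious flashed earlier by {emission_lead(1.0, v):.4f}")
```

For v = 0 the lead vanishes and everyone agrees the buttons were pushed together; for any v > 0 the lead is positive, and it changes sign for a train moving the other way, which is the second Swede’s story.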

Meanwhile, another Swede, also on the Committee, is moving by on another train 
described in the duelists’ frame by x = vt. You can fill in the rest. 

Young Einstein has bent the stately flow of time out of shape. Albert himself thought 
up this gedanken experiment—I have merely added a few dramatic details—showing that 
the constancy of the speed of light necessarily has to alter our notion of simultaneity in 
time. 
In theoretical physics, we say, “Mind-boggler in, mind-boggler out!” We feed the mind-boggling
fact that the speed of light does not depend on the observer into the wondrous
machinery of logic and out pops another mind-boggling fact, namely that simultaneity is 
in the mind of the beholder. Making up one gedanken experiment after another, Einstein 
showed that our common sense notion of time must be modified. 


Exercises 


1 Derive Snell’s law: $\sin\theta_w / \sin\theta_a = c_w / c_a < 1$, where $c_w$ and $c_a$ denote the speed of light in water and in air,
respectively.


2 Suppose the ant is outside a hemispherical bowl and the drop of honey is inside the bowl directly across 
from her. Find the shortest distance. 


3 What happens if the ant can crawl faster on the outside of the glass than on the inside?


Notes 


1. QFT Nut. 

2. R. P. Feynman, QED: The Strange Theory of Light and Matter, with a new introduction by A. Zee, Princeton 
Science Library, 2006. 

3. A colleague told me that this reminded him, at least superficially, of the umweg test
(http://www.guidehorse.com/intellig.htm) for assessing intelligence in horses.

4. I am referring to the fact that quarks and leptons come in three families.

5. In his autobiography, Michael Faraday wrote of his conception of scientists: “My desire to escape from trade, 
which I thought vicious and selfish, and to enter into the service of Science, which I imagined made its 
pursuers amiable and liberal. . . . ” Do I detect in the word “imagined” a trace of cynical disillusion? 
A Natural System of Units, the Cube of Physics, Being Overweight,
and Hawking Radiation 


Planck gave us natural units 


Max Planck* is properly revered for his profound contribution to quantum mechanics. But 
he is also much loved for his second greatest contribution to physics: in a far-reaching and 
insightful paper, he gave us a natural system of units. 

Once upon a time, we used some English king’s feet to measure lengths.† Einstein
recognized that with the universal speed of light c, we no longer need separate units for 
length and time. Even the proverbial guy and gal in the street understand that henceforth, 
we could measure length in lightyears. 

We and another civilization, be they in some other galaxy, would now be able to agree 
on a unit of distance, if we could only communicate to them what we mean by one year 
or one day. Therein lies the rub: our unit for measuring time derives from how fast our 
home planet spins and revolves around its star. Only homeboys would know. How could 
we possibly communicate to a distant civilization this period of rotation we call a day, which
is merely an accident of how some interstellar debris came together to form the rock we 
call home? 


* In his personal life, Planck suffered terribly. He lost his first wife, then his son in action in World War I, then 
both daughters in childbirth. In World War II, bombs totally demolished his house, while the Gestapo tortured 
his other son to death for trying to assassinate Hitler. 

† Notions we take for granted today still had to be thought up by someone. Maxwell, in his magnum opus
on electromagnetism, proposed that the meter be tied to the wavelength of light emitted by some particular 
substance, adding that such a standard “would be independent of any changes in the dimensions of the earth, 
and should be adopted by those who expect their writings to be more permanent than that body.” The various 
eminences of our subject could be quite sarcastic. 
Newton’s discovery of the universal law of gravity brought another constant G into
physics. Comparing the kinetic energy $\frac{1}{2}mv^2$ of a particle of mass m in a gravitational
potential with its potential energy $-GMm/r$ and canceling off m, we see that the combination*
$GM/c^2$ has dimensions of length. In other words, having two universal constants c
and G at hand allows us to measure masses in terms of our unit for length (or equivalently 
time), or lengths in terms of our unit for mass. 

Planck with his constant $\hbar$ made a monumental contribution to physics by noting that
the quantum world gives us for free a fundamental set of units that physicists call natural 
units. 


Three big names, three basic principles, three natural units 


To see how, note that Heisenberg’s uncertainty principle tells us that $\hbar$ divided by the
momentum $Mc$ is a length. Equating the two lengths $GM/c^2$ and $\hbar/Mc$, we see that the
combination $\hbar c/G$ has dimensions of mass squared. In other words, the three fundamental
constants $G$, $c$, and $\hbar$ allow us to define a mass,¹ known rightfully as the Planck
mass


$$ M_P = \sqrt{\frac{\hbar c}{G}} \tag{1} $$


We can immediately define, with Heisenberg’s help, a Planck length 


$$ l_P = \frac{\hbar}{M_P c} = \sqrt{\frac{\hbar G}{c^3}} \tag{2} $$


and, with Einstein’s help, a Planck time 


$$ t_P = \frac{l_P}{c} = \sqrt{\frac{\hbar G}{c^5}} \tag{3} $$


Einstein, Newton, and Heisenberg: three big names, three basic principles, three
natural units to measure space, time, and energy by. We have reduced the MLT system
to nothing! We no longer have to invent or find some unit, such as the good king’s foot,
to measure the universe with. We measure mass in units of $M_P$, length in units of $l_P$, and
time in units of $t_P$. Another way of saying this is that in these natural units, $c = 1$, $G = 1$,
and $\hbar = 1$. The natural system of units is understood no matter where your travels might
take you, within this galaxy or far beyond. 


Newton small, so Planck huge, and the Mother of All Headaches 


The Planck mass works out to be $10^{19}$ times the proton mass $M_p$. That humongous number
$10^{19}$, as we will see, is responsible for the Mother of All Headaches plaguing fundamental


* You will learn shortly what this combination means physically. 
physics today.² That $M_P$ is so gigantic compared to the known particles can be traced back
to the extreme feebleness of gravity: $G$ is tiny, so $M_P$ is enormous.

As the Planck mass is huge, the Planck length and time are teeny. If you insist on 
contaminating the purity of natural units by manmade ones, $t_P$ comes out to be $\approx 5.4 \times 10^{-44}$
second, the Planck length $l_P \approx 1.6 \times 10^{-33}$ centimeter, and the Planck mass $M_P \approx 2.2 \times 10^{-5}$ gram!
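
If you would like to check these numbers yourself, here is a minimal sketch that evaluates equations (1) through (3). The values of the constants are the standard CODATA ones quoted in SI units, and the variable names are my own.

```python
import math

# CODATA values in SI units.
HBAR = 1.054571817e-34     # reduced Planck constant, J s
C = 2.99792458e8           # speed of light, m/s
G = 6.67430e-11            # Newton's constant, m^3 kg^-1 s^-2
M_PROTON = 1.67262192e-27  # proton mass, kg

planck_mass = math.sqrt(HBAR * C / G)      # equation (1), in kg
planck_length = HBAR / (planck_mass * C)   # equation (2), in m
planck_time = planck_length / C            # equation (3), in s

print(f"Planck mass   = {planck_mass * 1e3:.3e} gram")
print(f"Planck length = {planck_length * 1e2:.3e} centimeter")
print(f"Planck time   = {planck_time:.3e} second")
print(f"Planck mass / proton mass = {planck_mass / M_PROTON:.2e}")
```

The last line gives the ratio behind the Mother of All Headaches, a number of order $10^{19}$.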

It is important to realize how profound Planck’s insight was. Nature herself, far tran- 
scending any silly English king or some self-important French revolutionary committee, 
gives us a set of units to measure her by. We have managed to get rid of all manmade units. 
We needed three fundamental constants, each associated with a fundamental principle, 
and we have precisely three! 

This suggests that we have discovered all* the fundamental principles that there are. 
Had we not known about the quantum, then we would have to use one manmade unit to 
describe the universe, which would be weird. From that fact alone, we would have to go 
looking for quantum physics. 


The cube of physics 


Here is a nifty summary of all of physics as a cube (see figure 1). Physics started with 
Newtonian mechanics at one corner of the cube, and is now desperately trying to get to 
the opposite corner, where sits the alleged Holy Grail. The three fundamental constants, 
$c^{-1}$, $\hbar$, and $G$, characterizing Einstein, Planck or Heisenberg, and Newton, label the three
axes. As we turned on one or the other of three constants (in other words, as each of these
constants came into physics), we took off from the home base of Newtonian mechanics.†
Much of 20th century physics consisted of getting from one corner of the cube to another. 


Consider the bottom face³ of the cube. When we turned on $c^{-1}$, we went from Newtonian
mechanics to special relativity. When we turned on $\hbar$, we went from Newtonian mechanics
to quantum mechanics. When we turned on both $c^{-1}$ and $\hbar$, we arrived at quantum field
theory, in my opinion the greatest monument of 20th century physics.

Newton himself had already moved up the vertical axis from Newtonian mechanics to 
Newtonian gravity by turning on $G$. Turning on $c^{-1}$, Einstein took us from that corner to
Einstein gravity, the main subject of this book.‡ All the Sturm und Drang of the past few
decades is the attempt to cross from that corner to the Holy Grail of quantum gravity, when
(glory glory hallelujah!) all three fundamental constants are turned on.§


* These days, fundamental principles are posted on the physics archive with abandon. There might be 
hundreds by now. 

† By this I mean the three laws, F = ma and so on, not including the law of universal gravitation.

‡ The corner with $c^{-1} = 0$ but $\hbar \neq 0$ and $G \neq 0$ is relatively unpublicized and generally neglected. It covers
phenomena described adequately by nonrelativistic quantum mechanics in the presence of a gravitational field.
Two fascinating experiments in this area are: (1) dribbling neutrons like basketballs, and (2) interfering a neutron
beam with itself in a gravitational field.⁴

§ This statement carries a slight caveat, which we will come to in chapter VII.3.
Figure 1 The cube of physics. (Legible labels in the figure: mechanics, $c^{-1}$, Einsteinian mechanics.)


In our everyday existence, we are aware of only two corners of this cube, because these 
three fundamental constants are either absurdly small or absurdly large compared to what 
humans experience. 


The universe’s obesity index 


As the obesity epidemic sweeps over the developed countries, one government after 
another has issued some kind of obesity index, basically dividing body weight by size. 
As we have seen, for an object of mass M, the combination $GM/c^2$ is a length that can be
compared to the characteristic size of the object. So, Nature has her own obesity index for 
any object, from electron to galaxy. Indeed, as is well known, John Michell in 1783 and the 
Marquis Pierre-Simon Laplace in 1796 pointed out that even light cannot escape from an 
object excessively massive for its size. 

More precisely, consider an object of mass M and radius R. A particle of mass m at the 
surface of this object has a gravitational potential energy $-GMm/R$ and kinetic energy
$\frac{1}{2}mv^2$. Equating the magnitudes of these two energies gives the escape velocity $v_{\text{escape}} = \sqrt{2GM/R}$. Setting
$v_{\text{escape}}$ to $c$ tells us that if $2GM > Rc^2$, not even light can escape, and the object is a black
hole.⁵ Remarkably, even though the physics behind the argument* is not correct in detail
(as we now know, we should not treat light as a Newtonian corpuscle with a tiny mass), this 


* This often cited Newtonian argument actually does not establish the existence of a black hole, defined as an
object from which nothing could escape. The escape velocity refers to the initial speed with which we attempt to 
fling something into outer space. In the Newtonian world, we could certainly escape from any massive planet in 
a rocket with a powerful enough engine. 
Figure 2 A plot of M versus R for various objects in the universe. EW stands for electroweak
and GUT for grand unified theory. The shaded area represents the “black hole” regime 
with 2GM > R. 


criterion, including the factor of 2, turns out to hold in Einstein’s theory. Figure 2 shows 
a plot of M versus R for various objects in the universe. 
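
To get a feel for Nature’s obesity index, here is a short sketch that evaluates $2GM/(Rc^2)$ and the Newtonian escape velocity for a few objects. The masses and radii of the Earth and the Sun are standard rounded values; the neutron star entry is an assumed representative example, not a particular star.

```python
import math

G = 6.67430e-11   # Newton's constant, m^3 kg^-1 s^-2
C = 2.99792458e8  # speed of light, m/s

def escape_velocity(mass_kg, radius_m):
    """Newtonian escape velocity sqrt(2GM/R), in m/s."""
    return math.sqrt(2.0 * G * mass_kg / radius_m)

def obesity_index(mass_kg, radius_m):
    """The dimensionless ratio 2GM/(R c^2); the black hole regime starts at 1."""
    return 2.0 * G * mass_kg / (radius_m * C**2)

objects = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (1.989e30, 6.957e8),
    "neutron star (assumed 1.4 solar masses, 12 km)": (1.4 * 1.989e30, 1.2e4),
}

for name, (mass, radius) in objects.items():
    index = obesity_index(mass, radius)
    v = escape_velocity(mass, radius)
    print(f"{name}: index = {index:.3e}, escape velocity = {v:.3e} m/s")
```

The index is just the square of the escape velocity in units of c, so an object tips into the shaded region of figure 2 exactly when the Newtonian escape velocity would exceed the speed of light.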


Hawking radiation 


Unless you have been hiding out in the jungles of New Guinea, you would have heard that 
in an extremely influential paper, Stephen Hawking, building on the earlier work of Jacob 
Bekenstein and others, and working in collaboration with Gary Gibbons, pointed out that this
purely classical argument needs to be amended when quantum effects are included: black
holes evaporate and radiate particles. 

In fact, the temperature of the radiation, known as the Hawking temperature $T_H$ of the
black hole, can be estimated by using dimensional analysis. You may be puzzled,* since
there are two masses in the problem, the mass $M$ of the black hole and the Planck mass
$M_P$. With two masses, any function of $M/M_P$ is admissible, and so dimensional analysis
appears to be inapplicable. Indeed, we need one more piece of information. The key is that 


* I was talking to a distinguished condensed matter physicist just the other day, and he was puzzled about
precisely this point. So your unspoken question may be widespread. 
Newton’s constant G is a multiplicative measure of the strength of gravity. In Einstein’s
theory as well as in Newton’s, the gravitational field around an object of mass M can only 
depend on the combination $GM$. Let us now set $c$ and $\hbar$ (but not $G$) to 1. The combination
$GM$ is a length and hence an inverse mass. On the other hand, Boltzmann and the
founding fathers of statistical mechanics had long ago revealed to us that temperature,
a highly mysterious concept at one time, is merely the average energy⁶ of the microscopic
constituents of macroscopic matter. Hence temperature has the dimensions of energy, that 
is, of a mass in units with c= 1. 

It follows immediately that $T_H \sim 1/(GM)$. This “sophisticated” dimensional analysis captures
an essential piece of physics: the radiation is explosive! As the black hole radiates
energy, $M$ goes down and $T_H$ goes up, and thus the black hole radiates faster. The radiative
mass loss accelerates. Certainly not something you want to see in the kitchen: an object 
that gets hotter as it loses energy. 

In chapter VII.3, we will see that the overall numerical constant can be determined in a 
couple of lines of algebra, so that 


$$ T_H = \frac{\hbar c^3}{8\pi G M} \tag{4} $$


We have restored $c$ and $\hbar$ by high school dimensional analysis using everyday unnatural
units. It is gratifying to see that indeed, with $\hbar = 0$ and quantum effects turned off, $T_H = 0$,
and the black hole does not radiate.
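
For the curious, here is a sketch that puts numbers into equation (4). To report the answer in kelvin rather than in energy units, it divides by the Boltzmann constant (set to 1 in the text, see note 6). The constants are standard CODATA values, and the solar mass is used only as a convenient sample.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23      # Boltzmann constant, J/K (restored here to get kelvin)
M_SUN = 1.98847e30      # solar mass, kg

def hawking_temperature(mass_kg):
    """Equation (4), divided by the Boltzmann constant so the result is in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)

for fraction in (1.0, 0.5, 0.1):
    mass = fraction * M_SUN
    print(f"M = {fraction} solar mass: T_H = {hawking_temperature(mass):.3e} K")
```

A black hole of one solar mass comes out colder than a ten-millionth of a kelvin, and halving the mass doubles the temperature: the object gets hotter as it loses energy.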

Thermodynamics states that entropy S is given by dE = TdS. Here E is just the mass 
of the black hole. Integrating $dS/dM = 1/T_H \sim GM$, we obtain


$$ S \sim GM^2 \sim \left( \frac{M}{M_P} \right)^2 \tag{5} $$


Note that, as expected, S is dimensionless. 
Using the fact that the black hole has radius $R \sim GM$ and hence surface area $A \sim R^2$,
we conclude that


$$ S \sim \frac{A}{G} \tag{6} $$


You should be shocked, shocked, shocked. Most theoretical physicists were, and are. 

Not shocked? 

Normally, the entropy of a system is extensive, that is, proportional to its volume. 
Somehow, a black hole has an entropy proportional to its surface area rather than to 
its volume. This fact has led to the so-called holographic principle. Many fundamental 
physicists believe that this mysterious property of black holes holds the key to quantum 
gravity. 

All of this merely from dimensional analysis! 
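
Going one small step beyond dimensional analysis, the sketch below takes equation (4) at face value, integrates dE = T dS with E = Mc² to get S = 4πGM²/(ħc), and compares the result with the horizon area, assuming the radius R = 2GM/c² from the criterion 2GM > Rc² above. The function names are my own, and the entropy is in units with the Boltzmann constant set to 1.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30      # solar mass, kg

def entropy(mass_kg):
    """S from integrating dS = c^2 dM / T_H with T_H given by equation (4)."""
    return 4.0 * math.pi * G * mass_kg**2 / (HBAR * C)

def horizon_area(mass_kg):
    """Area of a sphere of radius R = 2GM/c^2, in square meters."""
    radius = 2.0 * G * mass_kg / C**2
    return 4.0 * math.pi * radius**2

planck_length_squared = HBAR * G / C**3  # square of equation (2) of this chapter, in m^2

s = entropy(M_SUN)
area_in_planck_units = horizon_area(M_SUN) / planck_length_squared
print(f"S for one solar mass = {s:.3e}")
print(f"A / l_P^2            = {area_in_planck_units:.3e}")
print(f"S / (A / l_P^2)      = {s / area_in_planck_units:.4f}")
```

Under these assumptions the entropy comes out as one quarter of the horizon area measured in units of the Planck length squared, and doubling the mass quadruples it: area, not volume.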
Notes


1. Some readers might wonder why we do not use the mass of the electron $m_e$. In modern particle physics,
the electron may not always have had the mass it has now, and in fact it might have been massless in the
early universe. The masses of elementary particles depend on quantum field theoretic notions known as
spontaneous symmetry breaking and the Higgs mechanism. We should express $m_e$ in terms of $M_P$, not $M_P$
in terms of $m_e$. In different areas of physics, different units are used: for example, the size of the hydrogen
atom might be used as a length unit. 

2. I return to this problem in due course, in chapter X.8, for example. 

3. This face, regarded as a square, was discussed in the very first section of the first chapter in QFT Nut. 

4. See appendix 5 to chapter X.8; for more details, see J. J. Sakurai and J. Napolitano, Modern Quantum 
Mechanics, pp. 110 and 133. 

5. Named by John Wheeler almost 200 years later. 

6. The Boltzmann constant k, which is merely a conversion factor between energy units and the markings on 
some tubes containing mercury known as degrees, has been set to 1. 
Relativity Is an Everyday and Ancient Concept


Butterflies will fly indifferently toward every side 


Relativity is all about the notion that you are as good as the next guy, or to put it relatively, 
the other guy is as good as you. 

More seriously, relativity expresses the fact that the laws of physics as deduced by two
observers in uniform motion with respect to each other must be the same. 

We physicists believe in the fundamental principle that physics should not depend on 
the physicist, unlike some other academic disciplines we need not name, in which the 
truth can vary according to the practitioner. 

The proverbial guy in the street thinks that relativity started with Albert Einstein (1879–1955),
but you know better, of course. Surely, some smart human had an inkling of it as
soon as sufficiently smooth transport* became available, perhaps even the proverbial “cave
man”† drifting downriver on a log watching his buddies moving by. Galileo Galilei (1564–1642)
first¹ explicitly stated the principle of relativity. In Dialogue Concerning the Two Chief
World Systems (first published in 1632) the character Salviati says:


Shut yourself up with some friend in the main cabin below decks on some large ship, and 
have with you there some flies, butterflies, and other small flying animals. Have a large bowl 
of water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel 
beneath it. With the ship standing still, observe carefully how the little animals fly with equal 
speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into the 


* Of course, we are on a spinning rock orbiting a star in a rotating galaxy hurtling toward its neighbor at high 
speed, but our transport is so smooth that we didn’t notice it for the longest time. 
† Or a Sung dynasty poet in a boat; see Fearful, p. 52.
vessel beneath; and, in throwing something to your friend, you need throw it no more strongly
in one direction than another, the distances being equal; jumping with your feet together, you 
pass equal spaces in every direction. When you have observed all these things carefully (though 
doubtless when the ship is standing still everything must happen in this way), have the ship 
proceed with any speed you like, so long as the motion is uniform and not fluctuating this way 
and that. You will discover not the least change in all the effects named, nor could you tell from 
any of them whether the ship was moving or standing still. In jumping, you will pass on the 
floor the same spaces as before, nor will you make larger jumps toward the stern than toward 
the prow even though the ship is moving quite rapidly, despite the fact that during the time 
that you are in the air the floor under you will be going in a direction opposite to your jump. In 
throwing something to your companion, you will need no more force to get it to him whether 
he is in the direction of the bow or the stern, with yourself situated opposite. The droplets will 
fall as before into the vessel beneath without dropping toward the stern, although while the 
drops are in the air the ship runs many spans. The fish in their water will swim toward the 
front of their bowl with no more effort than toward the back, and will go with equal ease to bait 
placed anywhere around the edges of the bowl. Finally the butterflies and flies will continue 
their flights indifferently toward every side, nor will it ever happen that they are concentrated 
toward the stern, as if tired out from keeping up with the course of the ship, from which they 
will have been separated during long intervals by keeping themselves in the air. And if smoke is 
made by burning some incense, it will be seen going up in the form of a little cloud, remaining
still and moving no more toward one side than the other. The cause of all these correspondences
of effects is the fact that the ship’s motion is common* to all the things contained in it, and to
the air also. That is why I said you should be below decks; for if this took place above in the
open air, which would not follow the course of the ship, more or less noticeable differences
would be seen in some of the effects noted.²


That† is so beautifully stated! Much better than most popular physics books on the
market (see figure 1).

Galileo’s ship was updated to Einstein’s train‡ and later to rocket ships and other space
vehicles. Let’s use Einstein’s train, moving smoothly along the x-axis with velocity u (see
figure 2). Let an event occur at the point $(x, y, z)$ at time $t$ for the observer on the train (call
her Ms. Unprime) and at the point $(x', y', z')$ at time $t'$ for the observer on the ground (Mr.
Prime). We are of course utilizing the profound and brilliant insight of Galileo’s contemporary
René Descartes (1596–1650) that geometry can be reduced to algebra by associating
three numbers with each point in space. The Galilean transformation states that


$t' = t$ (1)


* The phrase “common to all the things contained in it” will play a starring role when we get to Einstein’s 
equivalence principle, as we will see in part V. 

† Galileo intended this passage as a refutation of the argument that the earth could not rotate since otherwise
objects would fall toward the west.

‡ The historian Peter Galison has pointed out that in the period leading up to 1905, the year Einstein proposed
his theory of special relativity, high speed trains and the telegraph linked the cities of Europe, and an increasingly
technological society was preoccupied with clock synchronization among other things.³
Figure 1 Galileo’s vision: butterflies fly normally in a cabin on a smoothly moving ship.

Figure 2 Galilean transformation.

and

$x' = x + ut$, $y' = y$, and $z' = z$ (2)


with u the constant relative velocity between the two observers.

We simply differentiate: $\frac{dx'}{dt'} = \frac{dx'}{dt} = \frac{dx}{dt} + u$. Thus, if Ms. Unprime tosses a ball forward
with speed v, Mr. Prime sees the ball moving forward with speed $v' = v + u$, in accordance
with everyday observation, as known to you, me, and Salviati. We have derived the Galilean
law* for the addition of velocities:

$v' = v + u$ (3)

Differentiating again, we obtain the ball’s acceleration $a' = \frac{dv'}{dt} = \frac{dv}{dt} = a$. Since Newton’s
law of motion F = ma involves acceleration, we conclude that Newtonian mechanics is
invariant under the Galilean transformation, as Salviati told us.
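The two differentiations above can be checked numerically. The following is a minimal sketch, not part of the original text: the ball's trajectory, the value of u, and all function names are illustrative assumptions. It applies (1) and (2) to a trajectory and uses finite differences to confirm that the velocities differ by u while the accelerations agree.

```python
# Sketch: numerical check of v' = v + u and a' = a under (1) and (2).
# The trajectory and the value of u are arbitrary illustrative choices.

def x_unprimed(t):
    """Ball position according to Ms. Unprime (speed 3 at t = 0, acceleration -2)."""
    return 3.0 * t - 1.0 * t**2

def galilean(x, t, u):
    """Galilean transformation: returns (x', t') = (x + u t, t)."""
    return x + u * t, t

def first_derivative(f, t, h=1e-3):
    """Central finite difference for df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def second_derivative(f, t, h=1e-3):
    """Central finite difference for d^2 f / dt^2."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

u = 5.0

def x_primed(t):
    """Ball position according to Mr. Prime."""
    return galilean(x_unprimed(t), t, u)[0]

t0 = 0.7
v, v_prime = first_derivative(x_unprimed, t0), first_derivative(x_primed, t0)
a, a_prime = second_derivative(x_unprimed, t0), second_derivative(x_primed, t0)
print("v' - v =", v_prime - v)   # close to u
print("a' - a =", a_prime - a)   # close to 0
```

Because t' = t, differentiating with respect to t' or t gives the same result, which is why a single time variable suffices in the sketch.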


* Which you can verify these days at any major airport with a moving sidewalk. 
Special relativity in one minute


Special relativity can be simply summarized. (Of course, we will be going through it 
in much greater detail later.) Maxwell’s laws of electromagnetism turned out not to be 
invariant under the Galilean transformation. The speed of light c is determined by how 
fast an electric field can turn into a magnetic field and vice versa and so does not depend 
on the observer. In total defiance of (3), Maxwell had 


$c \neq c + u$ (4)


In the high noon showdown between Maxwell and Galileo, Maxwell won. The Galilean 
transformation had to be replaced by the Lorentzian transformation involving that univer- 
sal constant of Nature, c for celeritas.* The relations (1) and (2) between space and time 
were modified. 


General relativity in 30 seconds 


That was special relativity in 60 seconds. But then we could ask, what would happen if u
were not constant, if Salviati’s ship encountered a storm, as it were? In deriving $a' = a$, we
used $\frac{du}{dt} = 0$, but if that were not so, we would have

$a' = a + \frac{du}{dt}$ (5)

Multiply this by m, the mass of the ball Ms. Unprime tossed forward, to obtain $ma' = ma + m\frac{du}{dt}$.
Mr. Prime, invoking Newton’s law, thus sees an additional force $m\frac{du}{dt}$ acting
on the ball.
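As a rough numerical illustration of (5), the sketch below lets the relative velocity grow linearly in time. Everything in it is an assumption made for illustration: the particular u(t), the ball's trajectory, the mass, and the choice to relate the two positions through the accumulated separation of the frames (the time integral of u), which reproduces v' = v + u(t).

```python
# Sketch: extra force m * du/dt seen when the relative velocity u(t) is not constant.
# All numbers and function names are illustrative assumptions.

DU_DT = 0.5  # constant rate of change of the relative velocity

def x_unprimed(t):
    """Ball position according to Ms. Unprime (acceleration a = -2)."""
    return 3.0 * t - 1.0 * t**2

def frame_separation(t):
    """Integral of u(t) = 5 + DU_DT * t from 0 to t."""
    return 5.0 * t + 0.5 * DU_DT * t**2

def x_primed(t):
    """Ball position according to Mr. Prime, so that v' = v + u(t)."""
    return x_unprimed(t) + frame_separation(t)

def second_derivative(f, t, h=1e-3):
    """Central finite difference for d^2 f / dt^2."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

m = 0.2
t0 = 1.3
a = second_derivative(x_unprimed, t0)
a_prime = second_derivative(x_primed, t0)
extra_force = m * (a_prime - a)
print("extra force:", extra_force, "expected m*du/dt:", m * DU_DT)
```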

What could that force possibly be? The answer to that question will lead us to curved
spacetime and Einstein gravity.⁴


Truth is not relative 


Later in life, Einstein moaned that he should have called his work “invariant theory” instead 
of “relativity theory.” Had he been more judicious in his choice of words, you, I, and 
Einstein would have been spared the spectacle of eminent humanities scholars asserting 
that “Truth is relative” since “There is no absolute truth: Einstein proved it so.” Of course, 
you know that Einstein said exactly the opposite. Physics must be invariant and true. 


Notes 


1. Perhaps some historian will track down others before Galileo. 

2. Galileo, Dialogue Concerning the Two Chief World Systems, trans. S. Drake, University of California Press, 
1953, pp. 186-187. 

3. See P. Galison, Einstein’s Clocks, Poincaré’s Maps: Empires of Time, W. W. Norton, 2004. 

4. Nitpickers, please! It’s what I could say in 30 seconds! 


* Einstein used V in his 1905 paper. 
From Newton to the Gravitational Redshift


Part I From Newton to Riemann: Coordinates to Curvature


Newton’s Laws 


The foundational equation of our subject 


For in those days I was in the prime of my age for invention
and minded Mathematicks & Philosophy more than at any time 
since. 


—Newton describing his youth in his memoirs 


Let us start with one of Newton’s laws, which curiously enough is spoken as F = ma but
written as ma = F. For a point particle moving in D-dimensional space with position given
by $\vec x(t) = (x^1(t), x^2(t), \cdots, x^D(t))$, Mr. Newton taught us that

$m \frac{d^2 x^i}{dt^2} = F^i$ (1)

with the index* $i = 1, \cdots, D$. For $D \le 3$ the coordinates have traditional “names”: for
example, for D = 3, $x^1$, $x^2$, $x^3$ are often called, with some affection, x, y, z, respectively.


Bad notation alert! In teaching physics, I sometimes feel, with only slight exaggeration,
that students are confused by bad notation almost as much as by the concepts. I am using
the standard notation of x and t here, but the letter x does double duty, as the position of the
particle, which more strictly should be denoted by $x^i(t)$ or $\vec x(t)$, and as the space coordinates
$x^i$, which are variables ranging from $-\infty$ to $\infty$ and which certainly are independent of t.

The different status between x and t in say (1) is particularly glaring if N > 1 particles
are involved, in which case we write $m_a \frac{d^2 x^i_a}{dt^2} = F^i_a$ or $m_a \frac{d^2 \vec x_a}{dt^2} = \vec F_a$, with $x^i_a(t)$ for
$a = 1, 2, \cdots, N$. But certainly $t_a$ is a meaningless concept in Newtonian physics. In the
Newtonian universe, t is the time ticked off by a universal clock, while $\vec x_a(t)$ is each
particle’s private business. We will have plenty more to say about this point. Here $x^i_a(t)$
are 3N functions of t, but there are still only 3 $x^i$.


* See appendix 2. 
Some readers may feel that I am overly pedantic here, but in fact this fundamental
inequality of status between x and t will come to a head when we get to the special theory
of relativity. (I now drop the arrow on x.) Perhaps Einstein as a student was bothered by
this bad notation. One way to remedy the situation is to use q (or $q_a$) to denote the position
of particles, as in more advanced treatments. But here I bow to tradition and continue to 
use x. 


Have differential equation, will solve 


After Newton’s great insight, we “merely” have to solve some second order differential 
equations. 

To understand Newton's fabulous equation, it’s best to work through a few examples. (I 
need hardly say that if you do not already know Newtonian mechanics, you are unlikely to 
be able to learn it here.) 

A priori, the force $F^i$ could depend on any number of things, but from experience we
know that in many simple cases, it depends only on x and not on t or $\frac{dx}{dt}$. As physicists
unravel the mysteries of Nature, it becomes increasingly clear that fundamental forces 
are derived from an underlying quantum field theory and that they have simple forms. 
Complicated forces often merely result from some approximations we make in particular 
situations. 


Example A 


A particle in 1-dimensional space tied to a spring oscillates back and forth.
The force F is a function of space. Newton’s equation

$m \frac{d^2 x}{dt^2} = -kx$ (2)

is easily solved in terms of two integration constants: $x(t) = a \cos \omega t + b \sin \omega t$, with
$\omega = \sqrt{k/m}$. The two constants a and b are determined by the initial position and initial
velocity, or alternatively* by the initial position at t = 0 and by the final position at some
time t = T. Energy, but not momentum, is conserved.
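The statements in Example A can be checked with a short numerical experiment. The sketch below is only an illustration under stated assumptions: the parameter values, the function names, and the choice of a velocity Verlet integrator are mine, not the text's. It integrates (2), compares the result with $a \cos \omega t + b \sin \omega t$, checks that the energy $\frac{1}{2}mv^2 + \frac{1}{2}kx^2$ stays fixed, and shows both ways of fixing a and b.

```python
import math

def simulate_spring(k, m, x0, v0, t_end, n_steps):
    """Integrate m x'' = -k x with the velocity Verlet scheme."""
    dt = t_end / n_steps
    x, v = x0, v0
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * (-k * x / m)
        x = x + dt * v_half
        v = v_half + 0.5 * dt * (-k * x / m)
    return x, v

def constants_from_initial_data(k, m, x0, v0):
    """a and b fixed by the initial position and initial velocity."""
    omega = math.sqrt(k / m)
    return x0, v0 / omega

def constants_from_endpoints(k, m, x0, x_final, t_final):
    """a and b fixed by x(0) and x(T); assumes sin(omega T) is not zero."""
    omega = math.sqrt(k / m)
    a = x0
    b = (x_final - a * math.cos(omega * t_final)) / math.sin(omega * t_final)
    return a, b

def exact_position(k, m, a, b, t):
    omega = math.sqrt(k / m)
    return a * math.cos(omega * t) + b * math.sin(omega * t)

def energy(k, m, x, v):
    return 0.5 * m * v**2 + 0.5 * k * x**2

k, m, x0, v0, T = 4.0, 1.0, 1.0, 0.5, 3.0
x_num, v_num = simulate_spring(k, m, x0, v0, T, 100_000)
a, b = constants_from_initial_data(k, m, x0, v0)
a2, b2 = constants_from_endpoints(k, m, x0, x_num, T)
print("numerical x(T):", x_num, "exact x(T):", exact_position(k, m, a, b, T))
print("energy at 0:", energy(k, m, x0, v0), "energy at T:", energy(k, m, x_num, v_num))
print("b from initial data:", b, "b from endpoints:", b2)
```

The endpoint version anticipates the boundary-value way of posing the problem that the footnote defers to part II.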


Example B 


We kick a particle in 1-dimensional space at t = 0. 

The force F is a function of time. This example allows me to introduce the highly useful
Dirac¹ delta function, or simply delta function.² By the word “kick” we mean that the
time scale $\tau$ during which the force acts is much less than the other time scales we are

* See part II. 
Figure 1 The delta function, which could be thought of as an infinitely sharp spike,
is strictly speaking not a function, but the limit of a sequence of functions.


interested in. Thus, take $F(t) = w\delta(t)$, where the function $\delta(t)$ rises sharply just before
t = 0, rapidly reaches its maximum at t = 0, and then sharply drops to 0. Because we
included a multiplicative constant w, we could always normalize $\delta(t)$ by

$\int_{-\infty}^{\infty} dt\, \delta(t) = 1$ (3)

As we will see presently, the precise form of $\delta(t)$ does not matter. For example, we could
take $\delta(t)$ to rise linearly from 0 at $t = -\tau$, reach a peak value of $1/\tau$ at t = 0, and then fall
linearly to 0 at $t = \tau$. For $t < -\tau$ and for $t > \tau$, the function $\delta(t)$ is defined to be zero. Take
the limit $\tau \to 0$, in which this function is known as the delta function. In other words the
delta function is an infinitely sharp spike. See figure 1.

The $\delta$ function is somehow treated as an advanced topic in mathematical physics, but in
fact, as you will see, it is an extremely useful function that I will use extensively in this book,
for example in chapters II.1 and III.6. More properties of the $\delta$ function will be introduced
as needed.

Integrating

$m \frac{d^2 x}{dt^2} = w\delta(t)$ (4)

from some time $t_- < 0$ to some time $t_+ > 0$, we obtain the change in velocity $v = \frac{dx}{dt}$:

$v(t_+) - v(t_-) = \frac{w}{m}$ (5)


Note that in this example, neither energy nor momentum is conserved. The lack of 
conservation is easy to understand: (4) does not include the agent administering the kick. In 
general, a time-dependent force indicates that the description is not dynamically complete. 
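The claim that the precise form of $\delta(t)$ does not matter can be tried out with the triangular spike described above. The following sketch is an illustration only: the midpoint integrator, the values of w and m, and the function names are assumptions. It integrates (4) across the kick for several widths $\tau$ and finds the same velocity jump w/m each time.

```python
def triangle_delta(t, tau):
    """Triangular spike: rises from 0 at -tau to 1/tau at 0, falls to 0 at +tau."""
    if abs(t) >= tau:
        return 0.0
    return (1.0 - abs(t) / tau) / tau

def integrate(f, lo, hi, n=20_000):
    """Midpoint rule on n equal subintervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def velocity_jump(w, m, tau, t_minus=-1.0, t_plus=1.0):
    """v(t_plus) - v(t_minus) from integrating dv/dt = (w/m) * delta_tau(t)."""
    return integrate(lambda t: (w / m) * triangle_delta(t, tau), t_minus, t_plus)

w, m = 3.0, 2.0
widths = (0.5, 0.1, 0.01)
areas = {tau: integrate(lambda t, tau=tau: triangle_delta(t, tau), -1.0, 1.0) for tau in widths}
jumps = {tau: velocity_jump(w, m, tau) for tau in widths}
for tau in widths:
    print("tau =", tau, "area =", areas[tau], "velocity jump =", jumps[tau])
```

Narrowing the spike changes its height but not its area, so the jump in velocity stays at w/m, in line with (5).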
Example C


A planet approximately described as a point particle of mass m goes around its sun of mass
$M \gg m$.

This is of course the celebrated problem Newton solved to unify celestial and terrestrial
mechanics, previously thought to be two different areas of physics. His equation now reads

$m \frac{d^2 \vec x}{dt^2} = -GMm \frac{\vec x}{r^3}$ (6)

where we use the notation $\vec x = (x, y, z)$ and $r = \sqrt{\vec x \cdot \vec x} = \sqrt{x^2 + y^2 + z^2}$.

John Wheeler has emphasized the interesting point that while Newton’s law (1) tells us 
how a particle moves in space as a function of time, we tend to think of the trajectory of 
a particle as a curve fixed in space. For example, when we think of the motion of a planet 
around the sun, we think of an ellipse rather than a spiral around the time axis. Even in 
Newtonian mechanics, it is often illuminating to think in terms of a spacetime picture
rather than a picture in space.³


Newton and his two distinct masses 


By thinking on it continually. 


—Newton (reply given when 
asked how he discovered 
the law of gravity) 


Conceptually, in (6), m represents two distinct physical notions of mass. On the left hand 
side, the inertial mass measures the reluctance of the object to move. On the right hand 
side, the gravitational mass measures how strongly the object responds to a gravitational 
field. The equality of the inertial and the gravitational mass was what Galileo tried to verify 
in his famous apocryphal experiment dropping different objects from the Leaning Tower 
of Pisa. Newton himself experimented with a pendulum consisting of a hollow wooden 
box, which he proceeded to fill with different substances, such as sand and water. In our 
own times, this equality has been experimentally verified⁴,⁵ to incredible accuracy.

That the same m appears on both sides of the equation turns out to be one of the 
greatest mysteries in physics before Einstein came along. His great insight was that this 
unexplained fact provided the clue to a deeper understanding of gravity. At this point all
we care about this mysterious equality is that m cancels out of (6), so that $\ddot{\vec x} = -\kappa \frac{\vec x}{r^3}$ with
$\kappa = GM$.


Celestial mechanics solved 


Since the force is “central,” namely it points in the direction of $\vec x$, a simple symmetry
argument shows that the motion is confined to a plane, which we take to be the (x-y)
plane. Set z = 0 and we are left with

$\ddot x = -\kappa x / r^3$ and $\ddot y = -\kappa y / r^3$ (7)

I have already, without warning, switched from Leibniz’s notation to Newton’s dot notation

$\dot x = \frac{dx}{dt}$ and $\ddot x = \frac{d^2 x}{dt^2}$ (8)


Since this is one of the most beautiful problems* in theoretical physics, I cannot resist 
solving it here in all its glory. Think of this as a warm-up before we do the heavy lifting 
of learning Einstein gravity. Also, later, we can compare the solution here with Einstein’s 
solution. 

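Evidently, we should change from Cartesian coordinates (x, y) to polar coordinates
$(r, \theta)$. We will do it by brute force to show, in contrast, the elegance of the formalism
we will develop later. Differentiate

$x = r \cos\theta$ and $y = r \sin\theta$ (9)

twice to obtain first

$\dot x = \dot r \cos\theta - r \sin\theta\, \dot\theta$ and $\dot y = \dot r \sin\theta + r \cos\theta\, \dot\theta$ (10)

and then

$\ddot x = \ddot r \cos\theta - 2\dot r \sin\theta\, \dot\theta - r \cos\theta\, \dot\theta^2 - r \sin\theta\, \ddot\theta$
and $\ddot y = \ddot r \sin\theta + 2\dot r \cos\theta\, \dot\theta - r \sin\theta\, \dot\theta^2 + r \cos\theta\, \ddot\theta$ (11)

(Note that in each pair of these equations, the second could be obtained from the first by
the substitution $\theta \to \theta - \frac{\pi}{2}$, so that $\cos\theta \to \sin\theta$, and $\sin\theta \to -\cos\theta$.)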

Multiplying the first equation in (7) by $\cos\theta$ and the second by $\sin\theta$ and adding, we
obtain, using (11),

$\ddot r - r \dot\theta^2 = -\frac{\kappa}{r^2}$ (12)

On the other hand, multiplying the first equation in (7) by $\sin\theta$ and the second by $\cos\theta$
and subtracting, we have

$2 \dot r \dot\theta + r \ddot\theta = 0$ (13)


I remind the reader again that we are doing all this in a clumsy brute force way to show 
the power of the formalism we are going to develop later. 
After staring at (13) we recognize that it is equivalent to

$\frac{d}{dt}\left(r^2 \dot\theta\right) = 0$ (14)

which implies that

$\dot\theta = \frac{l}{r^2}$ (15)

for some constant $l$. Inserting this into (12), we have

$\ddot r = \frac{l^2}{r^3} - \frac{\kappa}{r^2} \equiv -\frac{dv(r)}{dr}$ (16)

where we have defined

$v(r) = \frac{l^2}{2r^2} - \frac{\kappa}{r}$ (17)

Multiplying (16) by $\dot r$ and integrating over t, we have

$\int dt\, \frac{1}{2} \frac{d \dot r^2}{dt} = \int dt\, \dot r \ddot r = -\int dt\, \frac{dr}{dt} \frac{dv(r)}{dr} = -\int dr\, \frac{dv(r)}{dr}$

so that finally

$\frac{1}{2} \dot r^2 + v(r) = \epsilon$ (18)

with $\epsilon$ an integration constant.

This describes a unit mass particle moving in the potential $v(r)$ with energy $\epsilon$. Plot $v(r)$.
Clearly, if $\epsilon$ is equal to the minimum of the potential $v_{\min} = -\frac{\kappa^2}{2l^2}$, then $\dot r = 0$ and r stays
constant. The planet follows a circular orbit of radius $l^2/\kappa$. If $\epsilon > v_{\min}$ the orbit is elliptical,
with r varying between $r_{\min}$ (perihelion) and $r_{\max}$ (aphelion) defined by the solutions to
$\epsilon = v(r)$. For $\epsilon > 0$ the planet is not bound and should not even be called a planet.

We have stumbled across two conserved quantities, the angular momentum $l$ and the
energy $\epsilon$ per unit mass, seemingly by accident. They emerged as integration constants,
but surely there should be a more fundamental and satisfying way of understanding
conservation laws. We will see in chapter II.4 that there is.
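One can watch the two integration constants stay constant by integrating (7) directly. The sketch below is an illustration under assumptions of my own choosing: $\kappa = 1$, a particular set of initial conditions, a velocity Verlet integrator, and the function names. It monitors $l = x\dot y - y\dot x$ (which equals $r^2\dot\theta$) and $\epsilon = \frac{1}{2}(\dot x^2 + \dot y^2) - \kappa/r$ (which equals $\frac{1}{2}\dot r^2 + v(r)$), and compares the extreme radii reached with the two solutions of $\epsilon = v(r)$.

```python
import math

def accel(x, y, kappa):
    """Right-hand side of (7): acceleration -kappa * (x, y) / r^3."""
    r3 = (x * x + y * y) ** 1.5
    return -kappa * x / r3, -kappa * y / r3

def angular_momentum(x, y, vx, vy):
    """l = r^2 * dtheta/dt written in Cartesian form."""
    return x * vy - y * vx

def energy(x, y, vx, vy, kappa):
    """epsilon = (1/2) v^2 - kappa / r."""
    return 0.5 * (vx * vx + vy * vy) - kappa / math.hypot(x, y)

def turning_points(eps, l, kappa):
    """Solve eps = l^2 / (2 r^2) - kappa / r for a bound orbit (eps < 0)."""
    disc = math.sqrt(kappa * kappa + 2.0 * eps * l * l)
    return (-kappa + disc) / (2.0 * eps), (-kappa - disc) / (2.0 * eps)

def integrate_orbit(x, y, vx, vy, kappa, dt, n_steps):
    """Velocity Verlet; returns the final state and the extreme radii reached."""
    ax, ay = accel(x, y, kappa)
    r_lo = r_hi = math.hypot(x, y)
    for _ in range(n_steps):
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y, kappa)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        r = math.hypot(x, y)
        r_lo, r_hi = min(r_lo, r), max(r_hi, r)
    return (x, y, vx, vy), (r_lo, r_hi)

kappa = 1.0
start = (1.0, 0.0, 0.0, 1.2)          # x, y, xdot, ydot at t = 0
l0 = angular_momentum(*start)
eps0 = energy(*start, kappa)
final, (r_lo, r_hi) = integrate_orbit(*start, kappa, dt=1e-3, n_steps=20_000)
r_min, r_max = turning_points(eps0, l0, kappa)
print("l:", l0, "->", angular_momentum(*final))
print("epsilon:", eps0, "->", energy(*final, kappa))
print("radii from simulation:", r_lo, r_hi, "from epsilon = v(r):", r_min, r_max)
```

The integration time is chosen to cover more than one full orbit for these initial conditions, so both turning points are visited.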


Orbit closes 


One fascinating apparent mystery is that the orbit closes. In other words, as the particle
goes from $r_{\min}$ to $r_{\max}$ and then back to $r_{\min}$, $\theta$ changes by precisely $2\pi$. To verify that this
is so, solve (18) for $\dot r$ and divide by (15) to obtain $\frac{dr}{d\theta} = \pm (r^2/l)\sqrt{2(\epsilon - v(r))}$. Changing
variable from r to u = 1/r, we see, using (17), that $2(\epsilon - v(r))$ becomes the quadratic
polynomial $2\epsilon - l^2 u^2 + 2\kappa u$, which we can write in terms of its two roots as
$l^2 (u_{\max} - u)(u - u_{\min})$. Since u varies between $u_{\min}$ and $u_{\max}$, we are led to make another change
of variable from u to $\zeta$, with $u = u_{\min} + (u_{\max} - u_{\min}) \sin^2 \zeta$, so that $\zeta$ ranges from 0 to $\frac{\pi}{2}$. Thus,
as the particle completes one round trip excursion in r, the polar angle changes by (note
that $u_{\min} = 1/r_{\max}$ and $u_{\max} = 1/r_{\min}$)

$\Delta\theta = 2 \int_{r_{\min}}^{r_{\max}} \frac{l\, dr}{r^2 \sqrt{2(\epsilon - v(r))}} = 2 \int_{u_{\min}}^{u_{\max}} \frac{l\, du}{\sqrt{2\epsilon - l^2 u^2 + 2\kappa u}}$
$= 2 \int_{u_{\min}}^{u_{\max}} \frac{du}{\sqrt{(u_{\max} - u)(u - u_{\min})}} = 4 \int_0^{\pi/2} d\zeta = 2\pi$ (19)

That this integral turns out to be exactly $2\pi$ is at this stage nothing less than an apparent
miracle. Surely, there is something deeper going on, which we will reveal in chapter I.4.
Note also that the inverse square law is crucial here. Incidentally, the change of variable
here indicates how the Newtonian orbit* (and also the Einsteinian orbit, as we will see in
part VI) could be determined. See exercise 2.
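The integral in (19) can also be evaluated numerically. The sketch below is an assumption-laden illustration rather than anything from the text: the function name, the sample values of $\epsilon$, $l$, $\kappa$, and in particular the extra parameter beta, which adds a hypothetical $\beta/(2r^2)$ term to $v(r)$ in order to mimic a force that is not purely inverse square. It uses the same substitution $u = u_{\min} + (u_{\max} - u_{\min})\sin^2\zeta$ to tame the endpoint singularities.

```python
import math

def apsidal_angle(eps, l, kappa, beta=0.0, n=2000):
    """Change in theta over one round trip r_min -> r_max -> r_min.

    Potential: v(r) = (l^2 + beta) / (2 r^2) - kappa / r.
    beta = 0 is the potential (17); beta != 0 is a hypothetical perturbation.
    """
    A = l * l + beta
    # Roots of the quadratic 2*eps - A*u^2 + 2*kappa*u in u = 1/r.
    disc = math.sqrt(kappa * kappa + 2.0 * eps * A)
    u_min, u_max = (kappa - disc) / A, (kappa + disc) / A
    span = u_max - u_min
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        zeta = (i + 0.5) * h                      # midpoint rule in zeta
        u = u_min + span * math.sin(zeta) ** 2
        du_dzeta = 2.0 * span * math.sin(zeta) * math.cos(zeta)
        total += l * du_dzeta / math.sqrt(2.0 * eps - A * u * u + 2.0 * kappa * u)
    return 2.0 * total * h

eps, l, kappa = -0.28, 1.2, 1.0
closed = apsidal_angle(eps, l, kappa)
perturbed = apsidal_angle(eps, l, kappa, beta=0.1)
print("inverse square law:", closed, "vs 2*pi =", 2.0 * math.pi)
print("with extra 1/r^2 term in v(r):", perturbed)
```

With beta set to zero the result agrees with $2\pi$ to numerical precision. With the extra term the quadratic in u has leading coefficient $l^2 + \beta$, the angle comes out as $2\pi l/\sqrt{l^2 + \beta}$, and the orbit no longer closes, which illustrates why the inverse square law is crucial.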

Bad notation alert! In (1), the force on the right hand side should be written as $F^i(x(t))$
(in many cases). In C, the gravitational force exists everywhere, namely F(x) exists as a 
function, and what appears in Newton’s equation is just F (x) evaluated at the position of 
the particle x(t). In contrast, in A, with a mass pulled by a spring, F(x) does not make 
sense, only F(x (t)) does. The force exerted by the spring does not pervade all of space, and 
hence is defined only at the position of the particle x(t), not at any old x. I can practically 
hear the reader chuckling, wondering what kind of person I could be addressing here, but 
believe me, I have encountered plenty of students who confuse these two basic concepts: 
spatial coordinates and the location of particles. I may sound awfully pedantic, but when we 
get to curved spacetime, it is often important to be clear that certain quantities are defined 
only on so-called geodesic curves, while others are defined everywhere in spacetime. 


A historical digression on the so-called Newton’s constant 


Wouldn’t we be better off with the two eyes we now have plus a 
third that would tell us what is sneaking up behind? . . . With six 
eyes, we could have precise stereoscopic vision in all directions 
at once, including straight up. A six-eyed Newton might have 
dodged that apple and bequeathed us some levity rather than 
gravity. 

—George C. Williams” 


Physics textbooks by necessity cannot do justice to physics history. As you probably know, in
the Principia, Newton (1642–1727) converted his calculus-based calculations to geometric
arguments,⁸ which most modern readers find rather difficult to follow. Here I want to
mention another curious point: Newton never did specifically define what we call his
constant G. What he did with $ma = GMm/r^2$ was to compare the moon’s acceleration
with the apple’s acceleration: $a_{\rm moon} R^2_{\rm lunar\ orbit} = G M_{\rm earth} = a_{\rm apple} R^2_{\rm radius\ of\ earth}$. But to write
$G M_{\rm earth} = a_{\rm apple} R^2_{\rm radius\ of\ earth}$, he had to prove what is sometimes referred to as the first of
Newton’s two “superb theorems,” namely that with the inverse square law the gravitational
force exerted by a spherical mass distribution acts as if the entire mass were concentrated
in a point at the center of the distribution. (See exercise 4.) Even with his abilities, Newton
had to struggle for almost 20 years, the length of which contributed to the bitter priority 
fight he had with Hooke on the inverse square law, with Newton claiming that he had the 
law a long time before publication. You should be able to do it faster by a factor of ~10* as 
an exercise. 


* On the old one pound note, a portrait of Newton together with his orbits appears on the back. Amusingly, 
the artist felt compelled to put the sun at the center, rather than one of the foci, of the ellipse. 
Knowing the moon’s period and $R_{\rm lunar\ orbit}$, Newton could calculate $a_{\rm moon}$. Since
$R_{\rm radius\ of\ earth}$ had been known since antiquity, he was thus able to calculate $a_{\rm apple}$ and
obtained agreement* with Galileo’s measurement of $a_{\rm apple}$. This of course represents one
of the most magnificent advances in physics history, with Newton unifying⁹ the previously
disparate subjects of celestial and terrestrial mechanics in one stroke. I don’t have space
to dwell on this here, but I do want to call your attention to the fact that Newton did not
need to know G and $M_{\rm earth}$ to perform his feat.
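The moon-versus-apple comparison is easy to redo with a few lines of arithmetic. The sketch below uses modern rounded values (an assumption for illustration, certainly not the numbers Newton had) for the sidereal month, the mean earth-moon distance, the earth's mean radius, and the measured surface acceleration. Note that neither G nor the earth's mass appears anywhere.

```python
import math

# Modern rounded values, used only for illustration (not Newton's own numbers).
MOON_PERIOD_S = 27.32 * 24 * 3600     # sidereal month in seconds
R_LUNAR_ORBIT_M = 3.844e8             # mean earth-moon distance in meters
R_EARTH_M = 6.371e6                   # mean radius of the earth in meters
A_APPLE_MEASURED = 9.81               # surface acceleration in m/s^2

def centripetal_acceleration(period, radius):
    """Acceleration of uniform circular motion: (2 pi / period)^2 * radius."""
    omega = 2.0 * math.pi / period
    return omega**2 * radius

# a_moon from the moon's period and orbital radius alone.
a_moon = centripetal_acceleration(MOON_PERIOD_S, R_LUNAR_ORBIT_M)

# Inverse square law: a_moon * R_lunar^2 = a_apple * R_earth^2.
a_apple_predicted = a_moon * (R_LUNAR_ORBIT_M / R_EARTH_M) ** 2

relative_error = abs(a_apple_predicted - A_APPLE_MEASURED) / A_APPLE_MEASURED
print("a_moon =", a_moon, "m/s^2")
print("predicted a_apple =", a_apple_predicted, "m/s^2")
print("relative difference from measured value:", relative_error)
```

With these inputs the prediction lands within a percent or two of the measured value; this simple estimate treats the moon's orbit as a circle about a fixed earth.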

Indeed, G was not measured until 1798 by Henry Cavendish (1731-1810) using equip- 
ment built and designed by his friend John Michell (1724-1793), now of black hole fame, 
who died before he could carry out the experiment. 

Needless to say, what I have presented here should only be regarded as a comic book 
version of history. 


Appendix 1: Where is hell? 


You will find it in this appendix, sort of. 

Curiously, contrary to what some textbooks and popular books stated, Cavendish’s goal was not to measure
G, but $M_{\rm earth}$ and hence the earth’s density. Why this was of more interest to physicists of the time than G is in
itself another interesting tidbit in physics history.

I mentioned that Newton had two superb theorems and that the first triggered his feud with Hooke. His second
superb theorem states that there is no gravitational force inside a spherical shell.¹⁰ Are you curious why Newton
would even attack such a problem? An erroneous calculation had convinced him that the earth was much less
dense than the moon, which led his friend Edmond Halley (1656–1742), who by the way published the Principia
at his expense, to propose the hollow earth theory.¹¹ Witness the popularity of the idea in science fiction, notably
Jules Verne’s Journey to the Center of the Earth (1864). The idea may seem absurd to us, but at that time, a location
for hell had to be found, and leading physicists gave serious thought to this pressing problem. Every epoch in
physics has its own top ten problems.

So now we understand Cavendish’s interest in $M_{\rm earth}$ and hence in the density of the earth rather than in G.
Some textbooks give the impression that people easily obtained $M_{\rm earth}$ by multiplying the density of rock and the
volume of the earth. Not so easy if you think that the earth might be hollow! We learn from Newton’s second
theorem that there is no gravitational force in hell, so the usual portrayal of the leaping flames can’t be right!


Appendix 2: Fear of indices 


Occasionally, a student or two would profess, unaccountably, a “fear of indices.” In fact, there is nothing to
fear.¹² At this stage, just stand back and admire how clever the invention of indices is. Instead of giving names
to each coordinate axis, such as x, y, and z, we could pass fluidly between different dimensions by writing $x^i$,
with $i = 1, 2, \cdots, D$. The length of the alphabet we use does not limit us, and we could easily go beyond 26
dimensions.

When we get to Einstein’s theory, there will be a flood of indices, and we will have to distinguish between
upper and lower indices. In Newtonian mechanics, there is no significance to whether we write the index as a
superscript or a subscript. Have no fear: we will discuss each of these features of indices when the need arises.
At this point, we merely note that a quantity can carry more than one index. In the text, we wrote $x^i_a$, with
$i = 1, 2, \cdots, D$ labeling the different spatial directions, and $a = 1, 2, \cdots, N$ labeling the different particles. We
will encounter more examples as we go along.


* Newton’s first try did not lead to excellent agreement, because the value for the earth’s equatorial radius was 
off. Just a reminder that physics never progresses as smoothly as textbooks say. 
With only slight exaggeration, we could say that the invention of indices represents one of the really clever
ideas¹³ in the history of physics and mathematics, almost a “magic trick” that enables us to deal with as many
particles in as many spatial dimensions as we like with the mere addition of some subscripts and superscripts.


Exercises 


1 Show that for some suitably smooth function f(x), the integral $\int_{-\infty}^{\infty} dx\, \delta(x) f(x) = f(0)$. Then show that
$\delta(ax) = \delta(x)/|a|$ by evaluating the integral $\int_{-\infty}^{\infty} dx\, \delta(ax) f(x)$ for some smooth function f(x).


2 Determine the orbit $r(\theta)$ by changing variable from r to u = 1/r. We will need the result of this exercise later.


3 Newton thought that light consists of “corpuscles.” Calculate the deflection of light by the sun, applying what
you learned in the text to the case $\epsilon > 0$. Note that the mass of these minute “particles of light” drops out in
Newtonian theory anyway. We will need this result to compare with Einstein’s theory later in chapter VI.3.


4 Prove Newton’s first superb theorem: the gravitational force exerted by a spherical mass distribution acts as 
if the entire mass were concentrated in a point at the center of the distribution. 


5 Prove Newton’s second superb theorem. 


6 Suppose engineers can build a straight tunnel connecting two cities on earth. Then we could have a free 
unpowered “gravity express”¹⁴ by simply dropping a railroad car into the tunnel, allowing it to fall from one
city to the other. Use Newton’s two superb theorems to calculate the transit time. 


Notes 


1. Also introduced by Cauchy, Poisson, Hermite, Kirchhoff, Kelvin, Helmholtz, and Heaviside. See J. D. Jackson,
Am. J. Phys. 76 (2008), pp. 707-709.

2. Rigorous mathematicians go berserk at physicists’ use of the word “function” here; they prefer to call it a
distribution, defined as the limit of a function. But working physicists do not give a flying barf about such
niceties. In any case, I do not personally know a theoretical physicist suffering any harm by calling $\delta(t)$ a
function.

3. Consider a game of tennis. Compare a hard drive down the line and a soft lob high over the net. In both
cases, we are to solve Newton’s law $\frac{d^2 x}{dt^2} = 0$, $\frac{d^2 y}{dt^2} = -g$, with the boundary conditions x(0) = 0, x(T) = L, and
y(0) = y(T) = 0. (The problem is so elementary that we won’t bother to explain the notation, that y denotes
the vertical direction, that y = 0 is the ground, that T is the time of flight before the ball hits the ground,
that L is the length of the tennis court, and so on and so forth. You might want to draw your own figure.)
The solution is $x = Lt/T$, $y = \frac{1}{2} g (T - t) t$. Note that the two types of tennis shots are governed by the same
equation and the same L. Hence we obtain the same solution, but keep in mind that T is small in the case
of the hard drive and that T is large in the case of the soft lob. Now eliminate t to obtain y as a function
of x, namely $y(x) = \frac{1}{2} g T^2 \left(1 - \frac{x}{L}\right) \frac{x}{L}$, a parabola in both cases (of course). But compare the curvature of the
two parabolas: we have $\frac{d^2 y}{dx^2} = -g (T/L)^2$, very small in the case of the hard drive (small T) and very large in
the case of the lob (large T). The hard drive down the line barely skimming over the net, and the soft lob
climbing lazily high up into the sky, look and feel totally different pictured in space. In contrast, consider y
as a function of t. We also have two parabolas (of course), namely $y(t) = \frac{1}{2} g (T - t) t$, as given earlier. Now
compare the curvature of the two parabolas: we have $\frac{d^2 y}{dt^2} = -g$, the same in both cases. The curvature of the
ball’s trajectory in spacetime is universal (universal gravity, get it?). But we tend to see in our mind’s eye the
two parabolas y(x) in space, one for the hard drive and one for the lob, which look quite different, rather than
the parabolas y(t) in spacetime, which have the same curvature. I learned this long ago from John Wheeler.

4. Currently to one part in $10^{13}$. The modern round of experiments started with Loránd Eötvös in 1885 and
continues with the Eöt-Wash experiment led by E. Adelberger in our days.
5. The equality of the gravitational and inertial mass of the neutron has also been verified to good accuracy
using neutron interferometry.

6. For Newton’s letter to Halley about Hooke on the inverse square, see P. J. Nahin, Mrs. Perkins’s Electric Quilt,
Princeton University Press, 2009.

7. G. C. Williams, The Pony Fish’s Glow, Basic Books, 1997, p. 128.

8. S. Chandrasekhar, Newton’s Principia for the Common Reader, Oxford University Press, 2003.

9. Fearful, pp. 74-75.

10. For a popular account, see Toy/Universe.

11. N. Kollerstrom, “The Hollow World of Edmond Halley,” J. Hist. Astronomy 23 (1992), p. 185.

12. Surely most readers are familiar with indices. My son the biologist informs me that even biologists use indices
routinely; for example, on p. 20 of Genetics and Analysis of Quantitative Traits by M. Lynch and B. Walsh, indices
appear without explanation or apology.

13. A colleague told me to mention that indices are crucial in computer programming, something that many
readers can relate to.

14. Toy/Universe, p. xxix.


Conservation Is Good 


An integrability condition 


Conservation has been important to physics from day one.¹ In this chapter, we discuss the
origin of various conservation laws in Newtonian mechanics.
The most important case is when the force $F^i$ depends only on $x$ and can be written in
the form

$$F^i = -\frac{\partial V(x)}{\partial x^i} \qquad (1)$$

for $i = 1, 2, \cdots, D$. As we all learned, $V(x)$ is called the potential.

Suppose such a function $V(x)$ exists; then a clever person might have the insight to
multiply each of Newton’s equations

$$m\frac{d^2x^i}{dt^2} = F^i = -\frac{\partial V(x)}{\partial x^i} \qquad (2)$$

by $\frac{dx^i}{dt}$ to obtain the $D$ equations

$$m\frac{d^2x^i}{dt^2}\frac{dx^i}{dt} + \frac{\partial V(x)}{\partial x^i}\frac{dx^i}{dt} = 0 \qquad (3)$$

He or she would then recognize that the sum of these $D$ equations could be written as

$$\frac{d}{dt}\left[\frac{1}{2}m\sum_i\left(\frac{dx^i}{dt}\right)^2 + V(x)\right] = 0 \qquad (4)$$

which we could verify by explicit differentiation. Lo and behold, the total energy, defined by

$$E = \frac{1}{2}m\sum_i\left(\frac{dx^i}{dt}\right)^2 + V(x) \qquad (5)$$

is conserved. It does not change in time.
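To see (5) at work, here is a small numerical sketch, not taken from the text: it integrates Newton's equations (2) in $D = 2$ for an arbitrarily chosen potential and compares the energy at the start and at the end. The potential, the initial data, and the velocity Verlet stepping scheme are all illustrative assumptions.

```python
m = 1.0  # mass, in arbitrary units

def potential(x):
    """Illustrative V(x) in D = 2 (an assumption, not from the text)."""
    return 0.5 * (x[0]**2 + 4.0 * x[1]**2) + 0.25 * x[0]**2 * x[1]**2

def force(x):
    """F^i = -dV/dx^i for the potential above, worked out by hand."""
    return [-(x[0] + 0.5 * x[0] * x[1]**2),
            -(4.0 * x[1] + 0.5 * x[0]**2 * x[1])]

def energy(x, v):
    """Equation (5): kinetic plus potential energy."""
    return 0.5 * m * sum(vi**2 for vi in v) + potential(x)

def evolve(x, v, dt, steps):
    """Integrate m d2x/dt2 = F with the velocity Verlet scheme."""
    x, v = list(x), list(v)
    f = force(x)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        f = force(x)
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]
    return x, v

x0, v0 = [1.0, 0.5], [0.0, 0.3]
x1, v1 = evolve(x0, v0, dt=1e-3, steps=5000)
energy_drift = energy(x1, v1) - energy(x0, v0)
print("change in E after 5000 steps:", energy_drift)
```

The position and velocity change substantially over the run, while the energy changes only at the level of the integrator's discretization error.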
For $D = 1$, (1) holds automatically: $V(x)$ is simply given by $-\int^x dx'\,F(x')$. For $D > 1$,
the $D$ equations in (1), namely $F^i(x) = -\frac{\partial V(x)}{\partial x^i}$, imply the consistency or integrability
condition

$$\frac{\partial F^i(x)}{\partial x^j} = \frac{\partial F^j(x)}{\partial x^i} \qquad (6)$$

(Since derivatives commute, both sides of (6) are equal to $-\frac{\partial^2 V}{\partial x^i\partial x^j}$.) Thus, given $F^i(x)$, we
merely have to check to see whether (6) holds. If not, then $V$ does not exist. If yes, then we
could integrate $F^i(x) = -\frac{\partial V}{\partial x^i}$ for each $i$ to determine $V$.
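The check in (6) is easy to automate. The sketch below is only an illustration under stated assumptions: the two sample forces are invented for the purpose, the derivatives are estimated by finite differences, and the test is made at a single point, so passing it is necessary but not sufficient for a potential to exist.

```python
def jacobian(F, x, h=1e-5):
    """Finite-difference estimate of dF^i/dx^j at the point x."""
    D = len(x)
    J = [[0.0] * D for _ in range(D)]
    for j in range(D):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        F_plus, F_minus = F(x_plus), F(x_minus)
        for i in range(D):
            J[i][j] = (F_plus[i] - F_minus[i]) / (2.0 * h)
    return J

def satisfies_integrability(F, x, tol=1e-6):
    """Check dF^i/dx^j == dF^j/dx^i, equation (6), at one point."""
    J = jacobian(F, x)
    D = len(x)
    return all(abs(J[i][j] - J[j][i]) < tol
               for i in range(D) for j in range(D))

def gradient_force(x):
    """F = -grad V for the sample potential V = x^2 y."""
    return [-2.0 * x[0] * x[1], -x[0]**2]

def swirling_force(x):
    """F = (-y, x): a force that circulates around the origin."""
    return [-x[1], x[0]]

point = [0.7, -1.3]
print(satisfies_integrability(gradient_force, point))   # derived from a V
print(satisfies_integrability(swirling_force, point))   # no V exists
```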


Apples do not fall down 


Suppose $V(r)$ depends only on $r = \left(\sum_i (x^i)^2\right)^{1/2}$. In other words, the potential does not
pick out any preferred direction. We take this for granted nowadays, but it represents one
of the most astonishing insights of physics. Newton realized that the apple did not fall
down, but toward the center of the earth.
Differentiating $r^2 = \sum_i (x^i)^2$, we obtain $r\,dr = \sum_i x^i dx^i$ (an “identity,” which we will
use again and again in this text) or $\frac{\partial r}{\partial x^i} = \frac{x^i}{r}$, so that

$$F^i = -\frac{x^i}{r}V'(r) \quad \text{and} \quad \frac{\partial F^i}{\partial x^j} = -\frac{1}{r}\left[\left(\delta^{ij} - \frac{x^ix^j}{r^2}\right)V'(r) + \frac{x^ix^j}{r}V''(r)\right]$$

which is manifestly symmetric under $i \leftrightarrow j$.
Here we have introduced the Kronecker delta $\delta^{ij}$, defined by

$$\delta^{kj} = 1 \ \text{if}\ k = j, \qquad \delta^{kj} = 0 \ \text{if}\ k \neq j \qquad (7)$$

(which we can think of as an ancestor of the Dirac delta function³ introduced in chapter I.1).

The important point is not the somewhat involved expression for $\frac{\partial F^i}{\partial x^j}$, but that it is a
linear combination of $\delta^{ij}$ and $x^ix^j$. We haven't talked about tensors yet (see chapter I.4),
but this result could have been anticipated by a “what else can it be?” type of argument.
Not having any preferred direction, we could only construct an object with indices $i$ and
$j$ out of $\delta^{ij}$ and $x^ix^j$. We could have seen immediately that the integrability condition (6)
holds.

Note that this discussion holds for any value of D. 
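As a sanity check on the expression for $\partial F^i/\partial x^j$, one can compare it with a finite-difference derivative of $F^i = -(x^i/r)V'(r)$. The following sketch assumes the sample potential $V(r) = -1/r$ and works in $D = 4$ to stress that nothing is special about three dimensions; the function names are hypothetical.

```python
import math

def v_prime(r):
    """V'(r) for the sample potential V(r) = -1/r (an assumption)."""
    return 1.0 / r**2

def v_double_prime(r):
    return -2.0 / r**3

def radius(x):
    return math.sqrt(sum(c * c for c in x))

def force(x):
    """F^i = -(x^i / r) V'(r)."""
    r = radius(x)
    return [-(c / r) * v_prime(r) for c in x]

def dF_formula(x):
    """-(1/r) [ (delta^ij - x^i x^j / r^2) V' + (x^i x^j / r) V'' ]."""
    r = radius(x)
    D = len(x)
    return [[-(1.0 / r) * (((1.0 if i == j else 0.0) - x[i] * x[j] / r**2) * v_prime(r)
                           + (x[i] * x[j] / r) * v_double_prime(r))
             for j in range(D)] for i in range(D)]

def dF_numeric(x, h=1e-5):
    """Central finite-difference estimate of dF^i/dx^j."""
    D = len(x)
    result = [[0.0] * D for _ in range(D)]
    for j in range(D):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        F_plus, F_minus = force(x_plus), force(x_minus)
        for i in range(D):
            result[i][j] = (F_plus[i] - F_minus[i]) / (2.0 * h)
    return result

x = [0.3, -1.2, 0.8, 0.5]   # an arbitrary point in D = 4
formula, numeric = dF_formula(x), dF_numeric(x)
max_difference = max(abs(formula[i][j] - numeric[i][j])
                     for i in range(4) for j in range(4))
max_asymmetry = max(abs(formula[i][j] - formula[j][i])
                    for i in range(4) for j in range(4))
print(max_difference, max_asymmetry)
```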


Conservation of angular momentum 


Suppose the force in (2) points toward the center, so that it has the form $F^i = f(r)x^i$
(with $f(r) = -V'(r)/r$, as we just saw). Then we obtain angular momentum conservation
immediately. To see this, multiply Newton’s equation (2)

$$m\frac{d^2x^i}{dt^2} = f(r)x^i \qquad (8)$$

by $x^j$, so that $m x^j\frac{d^2x^i}{dt^2} = f(r)x^ix^j$. Subtract from this the same equation but with $i$ and $j$
interchanged. Regardless of the function $f(r)$, we find

$$x^j\frac{d^2x^i}{dt^2} - x^i\frac{d^2x^j}{dt^2} = 0 \qquad (9)$$

But this is the same as

$$\frac{d}{dt}\left(x^j\frac{dx^i}{dt} - x^i\frac{dx^j}{dt}\right) = 0 \qquad (10)$$

Clever, eh? I am constantly amazed by how brilliant early physicists were.

The quantity $l^{ij} \equiv \left(x^i\frac{dx^j}{dt} - x^j\frac{dx^i}{dt}\right)$, the angular momentum per unit mass, is conserved.
Recall that in the preceding chapter, this fact seemingly fell out when we changed
to polar coordinates. Note also that the argument given here holds for any $D \geq 2$.
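A short numerical experiment illustrates the point. This is a sketch under assumptions of my own choosing: $D = 3$, $m = 1$, the inverse-square attraction $f(r) = -1/r^3$, arbitrary initial data, and a velocity Verlet integrator. It evaluates the antisymmetric array $l^{ij}$ before and after the motion.

```python
import math

def acceleration(x):
    """f(r) x^i / m with f(r) = -1/r^3 and m = 1 (illustrative choice)."""
    r = math.sqrt(sum(c * c for c in x))
    return [-c / r**3 for c in x]

def angular_momentum(x, v):
    """l^ij = x^i v^j - x^j v^i, an antisymmetric D-by-D array."""
    D = len(x)
    return [[x[i] * v[j] - x[j] * v[i] for j in range(D)] for i in range(D)]

def evolve(x, v, dt, steps):
    """Velocity Verlet integration of d2x/dt2 = f(r) x."""
    x, v = list(x), list(v)
    a = acceleration(x)
    for _ in range(steps):
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        a = acceleration(x)
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]
    return x, v

x0, v0 = [1.0, 0.0, 0.2], [0.0, 0.9, 0.1]
l_start = angular_momentum(x0, v0)
x1, v1 = evolve(x0, v0, dt=1e-3, steps=5000)
l_end = angular_momentum(x1, v1)
max_change = max(abs(l_end[i][j] - l_start[i][j])
                 for i in range(3) for j in range(3))
print("largest change in any l^ij:", max_change)
```

Replacing the function inside `acceleration` by any other $f(r)$ should leave the conclusion unchanged, in line with the remark that (9) holds regardless of $f(r)$.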


Exercise 


1 Let $N$ particles interact according to

$$m_a\frac{d^2x_a^i}{dt^2} = -\frac{\partial V(x)}{\partial x_a^i} \qquad (11)$$

with $a = 1, \cdots, N$. Suppose $V(x_1, \cdots, x_N)$ depends only on the differences $x_a^i - x_b^i$, with $a, b = 1, \cdots, N$.
Show that the total momentum $\sum_a m_a\frac{dx_a^i}{dt}$ is conserved.


Notes 


1. Fearful. 

2. I once explained this point to humanists using Einstein’s terminology by saying that “The words up and 
down have no place in the Mind of the Creator.” See A. Zee, New Lit. Hist. 23 (1992), pp. 815-838. See also 
web.physics.ucsb.edu/jatila/supplements/zee lecture.pdf. 


3. In the sense that $\delta(x - y)$ is zero for $x \neq y$.
Rotation in the plane


My pedagogical strategy for this chapter is to take something you know extremely* well,
namely rotations in the plane, present it in a way possibly unfamiliar to you, and go through
it slowly in great detail, “beating the subject to death,” so to speak.

I have already mentioned that Monsieur Descartes had the clever idea of reducing
geometry to algebra. Put down Cartesian coordinate axes so that a point $P$ is labeled by two
real numbers $(x, y)$. Suppose another observer (call him Mr. Prime) puts down coordinate
axes rotated by angle $\theta$ with respect to the axes put down by the first observer (call her
Ms. Unprime) but sharing the same origin $O$. Elementary trigonometry tells us that the
coordinates $(x, y)$ and $(x', y')$ assigned by the two observers to the same point $P$ are related
by† (see figure 1)

$$x' = \cos\theta\,x + \sin\theta\,y, \qquad y' = -\sin\theta\,x + \cos\theta\,y \qquad (1)$$

The distance from $P$ to the origin $O$ of course has to be the same for the two observers.
According to Pythagoras, this requires $\sqrt{x'^2 + y'^2} = \sqrt{x^2 + y^2}$, which you can check using (1).
Introduce the column vectors $\vec{r} = \begin{pmatrix} x \\ y \end{pmatrix}$ and $\vec{r}' = \begin{pmatrix} x' \\ y' \end{pmatrix}$ and the rotation matrix

$$R(\theta) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \qquad (2)$$

so that we can write (1) more compactly as $\vec{r}' = R(\theta)\vec{r}$.


* If you don’t know rotations in the plane extremely well, then perhaps you are not ready for this book. A 
nodding familiarity with matrices and linear algebra is among the prerequisites. 
† For example, by comparing similar triangles in the figure, we obtain $x' = (x/\cos\theta) + (y - x\tan\theta)\sin\theta$.

Figure 1 The same point $P$ is labeled by $(x, y)$ and $(x', y')$, depending on the observer’s frame of reference.


As you recall from a course on mechanics, we can either envisage rotating the physical 
body we are studying or rotating the observer. We will consistently rotate the observer. 

We have already used the word “vector.” A vector is a physical quantity (for example the
velocity of a particle in the plane) consisting of two real numbers, so that if Ms. Unprime
represents it by $\vec{p} = \begin{pmatrix} p^1 \\ p^2 \end{pmatrix}$, then Mr. Prime will represent it by $\vec{p}' = R(\theta)\vec{p}$. In short, a
vector is something that transforms like the coordinates $\begin{pmatrix} x \\ y \end{pmatrix}$ under rotation.

Given two vectors $\vec{p} = \begin{pmatrix} p^1 \\ p^2 \end{pmatrix}$ and $\vec{q} = \begin{pmatrix} q^1 \\ q^2 \end{pmatrix}$, the scalar or dot product is defined by
$\vec{p}^{T} \cdot \vec{q} = p^1q^1 + p^2q^2$. Here $T$ stands for transpose and $\vec{p}^{T}$ the row vector $(p^1, p^2)$. By definition,
rotations leave $\vec{p}^{2} \equiv \vec{p}^{T} \cdot \vec{p} = (p^1)^2 + (p^2)^2$ invariant. In other words, if $\vec{p}' = R(\theta)\vec{p}$, then
$\vec{p}'^{2} = \vec{p}^{2}$. Since this works for any vector $\vec{p}$, including the case in which $\vec{p}$ happens to be
the sum of two vectors $\vec{p} = \vec{u} + \vec{v}$, and since $\vec{p}^{2} = (\vec{u} + \vec{v})^2 = \vec{u}^{2} + \vec{v}^{2} + 2\vec{u}^{T} \cdot \vec{v}$, rotation
also leaves the dot product between two arbitrary vectors invariant: the invariance of $\vec{p}^{2}$
implies that $\vec{u}'^{T} \cdot \vec{v}' = \vec{u}^{T} \cdot \vec{v}$.

Since $\vec{u}' = R\vec{u}$ (to unclutter things, we often suppress the $\theta$ dependence in $R(\theta)$) and so
$\vec{u}'^{T} = \vec{u}^{T}R^{T}$, we now have $\vec{u}^{T} \cdot \vec{v} = \vec{u}'^{T} \cdot \vec{v}' = (\vec{u}^{T}R^{T}) \cdot (R\vec{v}) = \vec{u}^{T} \cdot (R^{T}R)\vec{v}$. (The transpose
$M^{T}$ of a matrix $M$ is of course obtained by interchanging the rows and columns of $M$.) As
this holds for any two vectors $\vec{u}$ and $\vec{v}$, we must have the matrix equation

$$R^{T}R = I \qquad (3)$$

where, as usual, $I$ denotes the identity or unit matrix: $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Indeed, we could verify
(3) explicitly:

$$R(\theta)^{T}R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad (4)$$

Matrices that satisfy (3) are called orthogonal.
Taking the determinant of (3), we obtain $(\det R)^2 = 1$, that is, $\det R = \pm 1$. The determinant
of an orthogonal matrix may be $-1$ as well as $+1$. In other words, orthogonal matrices
also include reflection matrices, such as $P = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, a reflection in the $y$-axis.

To focus on rotations, let us exclude reflections by imposing the condition (since
$\det P = -1$)

$$\det R = 1 \qquad (5)$$


Matrices with unit determinant are called special. 

We define a rotation as a matrix that is both orthogonal and special, that is, a matrix that 
satisfies both (3) and (5). Thus, the rotation group of the plane consists of the set of all 
special orthogonal 2 by 2 matrices and is known as SO(2). 
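For readers who like to see numbers, here is a minimal sketch that checks the two defining conditions for a sample angle. It uses plain nested lists rather than a matrix library; the helper names, the angle, and the particular reflection matrix (one that flips the sign of $x$) are illustrative choices.

```python
import math

def rotation(theta):
    """The matrix R(theta) of equation (2), as nested lists."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

R = rotation(math.radians(29.0))
RtR = matmul(transpose(R), R)        # should be the identity, condition (3)
P = [[-1.0, 0.0], [0.0, 1.0]]        # a reflection: flips the sign of x
PtP = matmul(transpose(P), P)        # reflections are orthogonal too
PR = matmul(P, R)                    # orthogonal, but not special

# The length of a sample vector is the same for both observers.
x, y = 3.0, 4.0
x_prime = R[0][0] * x + R[0][1] * y
y_prime = R[1][0] * x + R[1][1] * y
length_prime = math.hypot(x_prime, y_prime)

print(RtR, det2(R))
print(PtP, det2(P), det2(PR))
print(length_prime)
```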

Note that matrices of the form $PR$ for any rotation $R$ are also excluded by (5), since
$\det(PR) = \det P \det R = (-1)(+1) = -1$. In particular, a reflection in the $x$-axis,
$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, which is the product of $P$ and a rotation through 180°, is also excluded.


Act a little bit at a time 


The Norwegian physicist Marius Sophus Lie (1842-1899) had the almost childishly obvious 
but brilliant idea that to rotate through, say, 29°, you could just as well rotate through a 
zillionth of a degree and repeat the process 29 zillion times. To study rotations, it suffices 
to study rotation through infinitesimal angles. Shades of Newton and Leibniz! A rotation 
through a finite angle could always be obtained by performing infinitesimal rotations 
repeatedly. As is typical with many profound statements in physics and mathematics, Lie’s 
idea is astonishingly simple. Replace the proverb “Never put off until tomorrow what you 
have to do today” by “Do what you have to do a little bit at a time.” 

When the angle is small enough, the rotation is almost the identity, that is, no rotation 
at all. Thus, we can write 


$$R(\theta) \simeq I + A \qquad (6)$$


where A denotes some infinitesimal matrix. 
Now suppose we have never seen (2). Indeed, suppose we have never even heard of
sine and cosine. Instead, let us define rotations as the set of linear transformations on
2-component objects $\vec{u}' = R\vec{u}$ and $\vec{v}' = R\vec{v}$ that leave $\vec{u}^{T} \cdot \vec{v}$ invariant. Following Lie, we
solve this condition on $R$, namely (3) $R^{T}R = I$, by considering an infinitesimal transformation
$R(\theta) \simeq I + A$. Since by assumption, $A^2$ can be neglected relative to $A$, $R^{T}R \simeq
(I + A^{T})(I + A) \simeq (I + A^{T} + A) = I$. We thus obtain $A^{T} = -A$, namely that $A$ must be
antisymmetric. But there is basically only one 2-by-2 antisymmetric matrix:

$$\mathcal{J} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \qquad (7)$$

In other words, the solution of $A^{T} = -A$ is $A = \theta\mathcal{J}$ for some real number $\theta$. Thus,
rotations close to the identity have the form $R = I + \theta\mathcal{J} + O(\theta^2) = \begin{pmatrix} 1 & \theta \\ -\theta & 1 \end{pmatrix} + O(\theta^2)$. The
antisymmetric matrix $\mathcal{J}$ is known as the generator of the rotation group.

An equivalent way of saying this is that for infinitesimal $\theta$, the transformation $x' \simeq
x + \theta y$ and $y' \simeq y - \theta x$ (you could verify that (1) indeed reduces to this to leading order in
$\theta$) obviously satisfies the Pythagorean condition $x'^2 + y'^2 = x^2 + y^2$ to first order in $\theta$. Or,
write $x' = x + \delta x$, $y' = y + \delta y$ and solve $x\,\delta x + y\,\delta y = 0$.

Alternatively, simply draw figure 1 for $\theta$ infinitesimal. Since we know the transformation
is linear, we could determine the matrix $R$ in (6) by looking at the figure to see what happens
to the two points $(x = 1, y = 0)$ and $(x = 0, y = 1)$ under an infinitesimal rotation.

Now recall the identity $e^x = \lim_{N\to\infty}\left(1 + \frac{x}{N}\right)^N$ (which you can easily prove by differentiating
both sides). Then, for a finite (that is, not infinitesimal) angle $\theta$, we have

$$R(\theta) = \lim_{N\to\infty}\left(R\left(\frac{\theta}{N}\right)\right)^N = \lim_{N\to\infty}\left(1 + \frac{\theta\mathcal{J}}{N}\right)^N = e^{\theta\mathcal{J}} \qquad (8)$$

The first equality represents Lie’s profound idea. For the last equality, we use the identity
just mentioned, which amounts to the definition of the exponential.

Some readers may not be familiar with the exponential of a matrix. Given a well-behaved
function $f$ with a power series expansion, we can define $f(M)$ for an arbitrary matrix
$M$ using that power series. For example, define $e^M = \sum_{n=0}^{\infty} M^n/n!$; since we know how
to multiply and add matrices, this series makes perfect sense. (Whether or not any given
series converges is of course another issue.) We must be careful, however, in using various
identities that may or may not generalize. For example, the identity $e^ae^a = e^{2a}$ for $a$ a real
number, which we could prove by applying the binomial theorem to the product of two
series (square of a series in this case) generalizes immediately. Thus, $e^Me^M = e^{2M}$. But for
two matrices $M_1$ and $M_2$ that do not commute with each other, $e^{M_1}e^{M_2} \neq e^{M_1 + M_2}$.
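Both statements can be checked with a few lines of code. The sketch below sums the power series directly (truncated after a fixed number of terms, which is an assumption adequate only for small matrices like these) and uses two simple non-commuting matrices chosen for illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(M, terms=30):
    """e^M from the power series sum_n M^n / n!, truncated after `terms` terms."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, M)]    # M^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

M1 = [[0.0, 1.0], [0.0, 0.0]]
M2 = [[0.0, 0.0], [1.0, 0.0]]      # M1 M2 differs from M2 M1
M1_plus_M2 = [[0.0, 1.0], [1.0, 0.0]]
two_M1 = [[0.0, 2.0], [0.0, 0.0]]

product = matmul(expm(M1), expm(M2))     # e^{M1} e^{M2}
exp_of_sum = expm(M1_plus_M2)            # e^{M1 + M2}
square = matmul(expm(M1), expm(M1))      # e^{M1} e^{M1}
exp_of_double = expm(two_M1)             # e^{2 M1}

print(product, exp_of_sum)
print(square, exp_of_double)
```

The first pair of matrices differs, while the second pair agrees.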

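This provides an alternative but of course equivalent path to our result. To leading order,
we have every right to write $R\left(\frac{\theta}{N}\right) \simeq 1 + \frac{\theta\mathcal{J}}{N} \simeq e^{\theta\mathcal{J}/N}$ and thus $R(\theta) = \left(R\left(\frac{\theta}{N}\right)\right)^N = e^{\theta\mathcal{J}}$.

Finally, we easily check that the formula $R(\theta) = e^{\theta\mathcal{J}}$ reproduces (2) for any value of $\theta$.
We simply note that $\mathcal{J}^2 = -I$ and separate the exponential series into even and odd terms.
Thus

$$e^{\theta\mathcal{J}} = \sum_{n=0}^{\infty}\theta^n\mathcal{J}^n/n! = \left(\sum_{k=0}^{\infty}(-1)^k\theta^{2k}/(2k)!\right)I + \left(\sum_{k=0}^{\infty}(-1)^k\theta^{2k+1}/(2k+1)!\right)\mathcal{J}$$

$$= \cos\theta\,I + \sin\theta\,\mathcal{J} = \cos\theta\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \sin\theta\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \qquad (9)$$

which is precisely $R(\theta)$ as given in (2). Note this works because $\mathcal{J}$ plays the same role as
$i$ in the identity $e^{i\theta} = \cos\theta + i\sin\theta$.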
Poor Lie, he never made it into the 20th century. 
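Lie's idea can be tried out directly on a computer. The following sketch, with helper names and parameter values of my own choosing, builds a rotation through 29° in two ways: by composing about a million tiny steps $I + \theta\mathcal{J}/N$ (using repeated squaring, so $N$ is a power of 2), and by summing the exponential series. Both are compared with the sine and cosine form (2).

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def many_small_rotations(theta, doublings=20):
    """(I + theta*J/N)^N with N = 2**doublings, by repeated squaring."""
    n_steps = 2 ** doublings
    R = [[1.0, theta / n_steps], [-theta / n_steps, 1.0]]   # I + (theta/N) J
    for _ in range(doublings):
        R = matmul(R, R)     # squaring k times gives the 2**k-th power
    return R

def exponential_series(theta, terms=30):
    """Sum of (theta J)^n / n! for n < terms."""
    A = [[0.0, theta], [-theta, 0.0]]
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

theta = math.radians(29.0)
exact = [[math.cos(theta), math.sin(theta)],
         [-math.sin(theta), math.cos(theta)]]
from_small_steps = many_small_rotations(theta)
from_series = exponential_series(theta)

print(max_abs_diff(from_small_steps, exact))   # shrinks as N grows
print(max_abs_diff(from_series, exact))        # at the level of roundoff
```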


Two approaches to rotation 


Notice that I actually gave you two different approaches to rotation. Let us summarize the 
two approaches. In the first approach, applying trigonometry to figure 1, we write down (1) 
and hence (2). In the second approach, we specify what is to be left invariant by rotations 
and hence define rotations by the condition (3) that rotations must satisfy. Lie then tells 
us that it suffices to solve (3) for infinitesimal rotations. We could then build up rotations
through finite angles by multiplying infinitesimal rotations together, thus also arriving
at (2).

It might seem that the first approach is much more direct. One writes down (2) and that 
is that. The second approach appears more roundabout and involves some “fancy math.” 
It might even provoke an adherent of the first “more macho” approach to wisecrack, “Why, 
with a bit of higher education, sine and cosine are not good enough for you any more? You 
have to go around doing fancy math!” The point is that the second approach generalizes 
to higher dimensional spaces (and to other situations) much more readily than the first 
approach does, as we will see presently. Dear reader, in going through life, you would be 
well advised to always separate fancy but useful math from fancy but useless math. 

Before we go on, let us take care of one technical detail. We assumed that Mr. Prime and 
Ms. Unprime set up their coordinate systems to share the same origin O. We now show 
that this condition is unnecessary if we consider two points P and Q (rather than one point, 
as in our discussion above) and study how the vector connecting P to Q transforms. 

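Let Ms. Unprime assign the coordinates $\vec{r}_P = (x, y)$ and $\vec{r}_Q = (\tilde{x}, \tilde{y})$ to $P$ and $Q$, respectively.
Then Mr. Prime’s coordinates $\vec{r}'_P = (x', y')$ for $P$ and $\vec{r}'_Q = (\tilde{x}', \tilde{y}')$ for $Q$ are
given by $\vec{r}'_P = R(\theta)\vec{r}_P$ and $\vec{r}'_Q = R(\theta)\vec{r}_Q$. Subtracting the first equation from the second and
defining $\Delta x = \tilde{x} - x$, $\Delta y = \tilde{y} - y$, and the corresponding primed quantities, we obtain

$$\begin{pmatrix} \Delta x' \\ \Delta y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \qquad (10)$$

Rotations leave the distance between the points $P$ and $Q$ unchanged: $(\Delta x')^2 + (\Delta y')^2 =
(\Delta x)^2 + (\Delta y)^2$. You recognize of course that this is a lot of tedious verbiage stating the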
perfectly obvious, but I want to be precise here. Of course, the distance between any two 


points is left unchanged by rotations. (This also means that the distance between P and 
the origin is left unchanged by rotations; ditto for the distance between Q and the origin.) 


Invariance and geometry 


There is no royal road to geometry. 
—Euclid’s advice to a prince 


Let no one unversed in geometry enter here. 


—Plato’s motto, carved over the 
entrance to his academy 


Let us take the two points P and Q to be infinitesimally close to each other and replace 
the differences Ax’, Ax, and so forth by differentials dx’, dx, and so forth. Indeed, 
2-dimensional Euclidean space is defined by the distance squared between two nearby 
points: 


$$ds^2 = dx^2 + dy^2 \qquad (11)$$

Rotations are defined as linear transformations* $(x, y) \to (x', y')$, such that

$$dx'^2 + dy'^2 = dx^2 + dy^2 \qquad (12)$$


The whole point is that this now makes no reference to the origin O (and whether Mr. 
Prime and Ms. Unprime even share the same origin). 

The column $d\vec{x} = \begin{pmatrix} dx^1 \\ dx^2 \end{pmatrix} = \begin{pmatrix} dx \\ dy \end{pmatrix}$ is defined as the basic or ur-vector, the template for
all other vectors. To repeat, a vector is defined as something that transforms like $d\vec{x}$ under
rotations.

So, a vector is defined by how it transforms. An array of two numbers $\vec{p} = \begin{pmatrix} p^1 \\ p^2 \end{pmatrix}$ is a
vector if it transforms according to $\vec{p}' = R(\theta)\vec{p}$.

Sometimes it is very helpful, in order to understand what something is, to be given an
example of something that is not. As a simple example, given a $\vec{p}$, then $\begin{pmatrix} a p^1 \\ b p^2 \end{pmatrix}$ is definitely
not a vector if $a \neq b$. (You could easily write down more outrageous examples, such as
$\begin{pmatrix} (p^1)^2p^2 \\ (p^1)^3 + (p^2)^3 \end{pmatrix}$. That ain’t no vector!) You will work out further examples in exercise 1. An
array of numbers is not a vector unless it transforms in the right way.¹
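To make the criterion concrete, here is a small sketch of the test for the array $(a p^1, b p^2)$. The idea is to compare two things: the array as Mr. Prime would build it from his own components of $\vec{p}$, and the array Ms. Unprime built, rotated by $R(\theta)$. The function names, the sample vector, and the angle are illustrative assumptions.

```python
import math

def rotate(theta, p):
    """Apply R(theta) of equation (2) to the pair p."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] + s * p[1], -s * p[0] + c * p[1])

def stretched(p, a, b):
    """The candidate array (a p^1, b p^2) built from the components of p."""
    return (a * p[0], b * p[1])

def transforms_as_vector(a, b, theta=0.4, p=(1.0, 2.0), tol=1e-12):
    """True if building the array in the rotated frame equals rotating the array."""
    built_in_primed_frame = stretched(rotate(theta, p), a, b)
    rotated_array = rotate(theta, stretched(p, a, b))
    return all(abs(u - v) < tol
               for u, v in zip(built_in_primed_frame, rotated_array))

print(transforms_as_vector(2.0, 2.0))   # a = b: just a multiple of p
print(transforms_as_vector(2.0, 3.0))   # a != b: not a vector
```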


Oh, about the advice Euclid gave to the prince who wanted to know a quick way of 
mastering geometry. Mr. E is also telling you that, to master the material covered in this 
book, there is no way other than to cogitate over the material until you get it and to work 
through as many exercises as possible. 


From the plane to higher dimensional space 


The reader who has wrestled with Euler angles in a mechanics course knows that the 
analog of (2) for 3-dimensional space is already quite a mess. In contrast, Lie’s approach 
allows us, as mentioned above, to immediately jump to D-dimensional Euclidean space, 
defined by specifying the distance squared between two nearby points (compare this with 
(11)), as given by the obvious generalization of Pythagoras’ theorem: 


$$ds^2 = \sum_{i=1}^{D}(dx^i)^2 = (dx^1)^2 + (dx^2)^2 + \cdots + (dx^D)^2 \qquad (13)$$

This is as good a place as any to say a word about indices. As I said in chapter I.1, in
my experience teaching, there are always a couple of students confounded by indices.
Dear reader, if you are not, you could simply laugh and skip to the next paragraph.
Indices provide a marvelous notational device to save us from having to give names to
individual elements belonging to a set. (For example, consider all humans $h^i$ now alive,
with $i = 1, 2, \cdots, P$ where $P$ denotes the population size.) Take a look at the 19th century
physics literature, before the use of indices became widespread. I am always amazed by
the fact that, for example, Maxwell could see through the morass of the electromagnetic
equations written out component by component.

* Indeed, most, but not all, of the readers² of this book are constantly rotating between two coordinate systems.

Rotations are defined as linear transformations $d\vec{x}' = R\,d\vec{x}$ that leave $ds$ unchanged. The
preceding discussion allows us to write this condition as $R^{T}R = I$. As before, we want
to focus on rotations by imposing the additional condition $\det R = 1$. The set of $D$-by-$D$
matrices $R$ that satisfy these two conditions forms the special orthogonal group $SO(D)$,
which is just a fancy way of saying the rotation group in $D$-dimensional space.


Lie in higher dimensions 


The power of Lie now shines through when we want to work out rotations in higher
dimensional spaces. All we have to do is satisfy the two conditions $R^{T}R = I$ and $\det R = 1$.

So let us follow Lie and write $R \simeq I + A$. Then $R^{T}R = I$ is solved by requiring $A = -A^{T}$,
namely that $A$ must be antisymmetric. But it is very easy to write down all possible
antisymmetric $D$-by-$D$ matrices! For $D = 2$, there is basically only one: the $\mathcal{J}$ introduced
earlier. For $D = 3$, there are basically three of them:

$$\mathcal{J}_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}, \quad \mathcal{J}_y = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad \mathcal{J}_z = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad (14)$$

Any 3-by-3 antisymmetric matrix can be written as $A = \theta_x\mathcal{J}_x + \theta_y\mathcal{J}_y + \theta_z\mathcal{J}_z$, with three
real numbers $\theta_x$, $\theta_y$, and $\theta_z$. At this point, you can verify that $R \simeq I + A$, with $A$ as given
here, satisfies the condition $\det R = 1$ (to first order in the $\theta$s).

The three matrices $\mathcal{J}_x$, $\mathcal{J}_y$, $\mathcal{J}_z$ are known as the generators of the 3-dimensional rotation
group $SO(3)$. They generate rotations, but are of course not to be confused with rotations,
which are by definition 3-by-3 orthogonal matrices with determinant equal to 1.

The upshot of this whole discussion is that any 3-dimensional rotation (not necessarily
infinitesimal) can be written as $R(\theta) = e^{A}$ and is thus characterized by three real numbers.
As I said, those readers who have suffered through the rotation of a rigid body in a course 
on mechanics must appreciate the simplicity of studying the generators of infinitesimal 
rotations and then simply exponentiating them. 
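The recipe "pick three numbers, form an antisymmetric $A$, exponentiate" is short enough to try out. The sketch below assumes the generators written as in (14), sums the exponential series to a fixed number of terms, and uses three arbitrary angles; it then tests whether the result is orthogonal with unit determinant. All helper names are hypothetical.

```python
JX = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, -1.0, 0.0]]
JY = [[0.0, 0.0, -1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
JZ = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def expm(M, terms=40):
    """e^M from its power series, truncated after `terms` terms."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

theta_x, theta_y, theta_z = 0.3, -1.1, 0.7    # three arbitrary real numbers
A = [[theta_x * JX[i][j] + theta_y * JY[i][j] + theta_z * JZ[i][j]
      for j in range(3)] for i in range(3)]
R = expm(A)
RtR = matmul(transpose(R), R)
orthogonality_error = max(abs(RtR[i][j] - float(i == j))
                          for i in range(3) for j in range(3))
print(orthogonality_error, det3(R))
```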


Index notation and rotations 


Some readers will find this obvious, but others might find it helpful if we derive the
condition $R^{T}R = I$ explicitly once again using the index notation. I prefer to go slow here,
since we will need some of the same formalism later when we get to special relativity. Once
the reader feels sure-footed, we could then dispense with indices.

Let me start by reminding the reader that a $D$-by-$D$ matrix $M$ carries two indices and has
entries $M^{ij}$, with the standard convention that the first index labels the rows, the second
the column (for $i, j = 1, 2, \cdots, D$). For example, for $D = 2$, $M = \begin{pmatrix} M^{11} & M^{12} \\ M^{21} & M^{22} \end{pmatrix}$ and $M^{12}$ is
the entry in the first row and the second column, whereas $M^{21}$ is the entry in the second
row and the first column. Note that the transpose of a matrix $M$ is given by $(M^{T})^{ij} = M^{ji}$.
Thus, if $\vec{v}$ is a column vector with entries $v^j$, then the entries of the column vector $\vec{u} = M\vec{v}$
are given by $u^i = \sum_j M^{ij}v^j$. For $A$ and $B$ two $D$-by-$D$ matrices, the product $AB$ is defined
as the matrix with the entries $(AB)^{ij} = \sum_k A^{ik}B^{kj}$. (If everything here is news to you, see
the first footnote in this chapter.)

Under a rotation, 


$$dx'^i = \sum_j R^{ij}dx^j = R^{i1}dx^1 + R^{i2}dx^2 + \cdots + R^{iD}dx^D \qquad (15)$$


(I have written the sum out explicitly for the benefit of the rare reader afflicted by fear 
of indices.) Also, as was mentioned in chapter I.1, at this stage it is completely arbitrary 
whether we write upper or lower indices. 

Let us pause and recall the Kronecker delta symbol $\delta^{ij}$ introduced in (I.2.7), defined
to be equal to $+1$ if $i = j$ and 0 otherwise, and which we can also think of as a $D$-by-$D$
unit matrix. We will be encountering the highly useful Kronecker delta often in this book.
For example, $\sum_j A^jB^j = \sum_k\sum_j \delta^{kj}A^kB^j$. Since $\delta^{kj}$ vanishes unless $k$ is equal to $j$, the
double sum on the right hand side collapses to the single sum on the left hand side. In
other words, the Kronecker delta allows us to write a single sum as a double sum. It seems
like a really silly thing to do, but as we will see presently, it is an extremely useful trick that
we use quite often in this book.

We now determine how the matrix $R$ must be restricted for it to be a rotation. The
statement that $ds^2 = \sum_i (dx^i)^2$ as defined in (13) is left unchanged by the rotation implies
that (with all indices running over $1, \cdots, D$)

$$\sum_i (dx'^i)^2 = \sum_i\sum_k\sum_j R^{ik}dx^kR^{ij}dx^j = \sum_j (dx^j)^2 = \sum_k\sum_j \delta^{kj}dx^kdx^j \qquad (16)$$

In the last step, we used what we just learned.
Since the infinitesimals $dx^i$ can take on arbitrary values, to have the second term equal
to the last term in (16), we must equate the coefficients of $dx^kdx^j$ and demand that

$$\sum_i R^{ik}R^{ij} = \delta^{kj} = \sum_i (R^{T})^{ki}R^{ij} = (R^{T}R)^{kj} \qquad (17)$$

Indeed, we obtain $R^{T}R = I$ just as in (3), but now in $D$-dimensional space for any $D$.
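For readers who think in loops, index sums translate almost word for word into code: each summed index becomes a `for` (here written as a generator expression). The sketch below is only an illustration; the sample rotation (about the third axis in $D = 3$), the sample arrays, and the names are my own choices. It checks the Kronecker delta trick and condition (17).

```python
import math

D = 3
c, s = math.cos(0.6), math.sin(0.6)
R = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]   # a sample rotation

def delta(k, j):
    """Kronecker delta."""
    return 1.0 if k == j else 0.0

# A single sum rewritten as a double sum with the Kronecker delta.
A = [1.5, -2.0, 0.25]
B = [0.5, 4.0, 2.0]
single_sum = sum(A[j] * B[j] for j in range(D))
double_sum = sum(delta(k, j) * A[k] * B[j] for k in range(D) for j in range(D))

# Equation (17): sum over i of R^{ik} R^{ij} should equal delta^{kj}.
lhs = [[sum(R[i][k] * R[i][j] for i in range(D)) for j in range(D)]
       for k in range(D)]
max_deviation = max(abs(lhs[k][j] - delta(k, j))
                    for k in range(D) for j in range(D))

# Equation (16): the squared length of dx is unchanged.
dx = [0.01, -0.02, 0.03]
dx_prime = [sum(R[i][j] * dx[j] for j in range(D)) for i in range(D)]
ds2 = sum(component**2 for component in dx)
ds2_prime = sum(component**2 for component in dx_prime)

print(single_sum, double_sum, max_deviation, ds2, ds2_prime)
```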

We end this section with a trivial remark. So far in this chapter, we have written the
column vectors as columns. But columns take up so much space, and so for typographical
convenience (editors must be placated!) we will henceforth write the entries of a column
vector as $d\vec{x} = (dx^1, dx^2, \cdots, dx^D)$, a practice we will indulge in throughout this book.
(If we want to be insufferably pedantic, we could put in a $T$ for transpose: the column
ur-vector $d\vec{x} = (dx^1, dx^2, \cdots, dx^D)^{T}$.)

Einstein’s repeated index summation


Observe that in all those sums in (16) the indices to be summed over always appear
twice, that is, they are repeated. For example, in the second term in (16),
$\sum_i\sum_k\sum_j R^{ik}dx^kR^{ij}dx^j$, the indices $i$, $k$, and $j$ all appear repeated. Thus, we could adopt
the so-called repeated index summation convention proposed by Albert Einstein himself:
omit the pesky summation symbol and agree that if an index is repeated, then it is to be
summed over. For example, $dx'^i = \sum_j R^{ij}dx^j$ can now be written as $dx'^i = R^{ij}dx^j$: in the
expression on the right hand side, the index $j$ appears twice and is thus to be summed
over.* In contrast, $i$ is a “free” index and does not appear twice in the same expression.
Notice that free indices must match on opposite sides of any equation. It is rightly said
that one of Einstein’s greatest contributions to physics is the repeated index summation
convention.† When we get to Einstein gravity, we will meet lots of indices to be summed
over, and it would be silly to keep on writing the summation symbol.


Vector fields 


The vectors we encounter may well vary in space. For example, the flow velocity in a fluid 
in general would depend on where we are. We are then dealing with a vector field V(x). 
Again, consider two observers studying the same vector field. Mr. Prime would see 


$$\vec{V}'(\vec{x}') = R\vec{V}(\vec{x}) \qquad (18)$$

with $\vec{x}' = R\vec{x}$ of course. In other words, the two observers are studying the same vector
field at the same point $P$. See figure 2. As another example, the familiar electric $\vec{E}(\vec{x})$ and
magnetic $\vec{B}(\vec{x})$ fields are both vector fields.


Physics should not depend on the observer 


Let me stress again why physicists constantly talk about vectors. The laws of physics often
involve the statement that one vector is equal to another, for example, Newton’s law states
$m\vec{a} = \vec{F}$. Applying a rotation matrix $R(\theta)$, we obtain $mR(\theta)\vec{a} = R(\theta)\vec{F}$. If $\vec{F}$ transforms
like a vector, then $m\vec{a}' = \vec{F}'$. Ms. Unprime and Mr. Prime see the same Newton’s law, and
more generally, the same laws of physics!

This statement, while self-evident, is profound, and in some sense, it is what makes 
physics possible. Physics should not depend on the physicist. Ms. Unprime and Mr. Prime 


* When a pair of repeated indices, such as $j$ here, is summed over, they are often said to be contracted with
each other. In a tiny abuse of terminology, people also say that $R^{ij}$ is contracted with $dx^j$.

† It appeared only in his later work. In 1905, Einstein did not even use vector notation! In one system, the
coordinates were denoted by $x, y, z$, in the other, by $\xi, \eta, \zeta$; the components of the force acting on the electron
were called $X, Y, Z$. To modern eyes, his notation was a horrific mess.

Figure 2 Two observers studying the same vector field.

see different accelerations $\vec{a}$ and $\vec{a}'$, and different forces $\vec{F}$ and $\vec{F}'$, but the same Newton’s
law. We say that Newton’s law is invariant (that is, it does not change) under rotation.*
We should also remind ourselves that mass is an example of a scalar: a physical quantity 
that does not change under rotation. If it does change, Newton’s law would not be invariant 
under rotation and one observer would be preferred over another, which is unacceptable. 
Physics rests on the democratic ideal. 
Let me remind you that the gravitational force in the planetary problem studied in
chapter I.1 is derived from what is sometimes called a central potential, namely one without
a preferred direction: $F^i(x) = -\frac{\partial}{\partial x^i}V(r) = -\frac{x^i}{r}V'(r)$. Hence, $\vec{F}$ is proportional to $\vec{x}$ and
so a fortiori transforms like a vector.
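The claim that a central force transforms like a vector can be tested in a few lines, in the spirit of (18): evaluate the force at the rotated point and compare with the rotated force. The sketch below works in the plane and assumes the sample potential $V(r) = -1/r$; the angle, the point, and the function names are arbitrary illustrative choices.

```python
import math

def rotate(theta, v):
    """Apply the rotation matrix R(theta) to the pair v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1])

def central_force(x):
    """F^i = -(x^i / r) V'(r) for the sample potential V(r) = -1/r."""
    r = math.hypot(x[0], x[1])
    v_prime = 1.0 / r**2
    return (-x[0] / r * v_prime, -x[1] / r * v_prime)

theta = 1.1
x = (0.8, -0.3)
force_at_rotated_point = central_force(rotate(theta, x))   # F evaluated at x' = R x
rotated_force = rotate(theta, central_force(x))            # R applied to F(x)
mismatch = max(abs(u - v)
               for u, v in zip(force_at_rotated_point, rotated_force))
print(force_at_rotated_point, rotated_force, mismatch)
```

A force with a built-in preferred direction, such as a constant vector pointing "down," would fail this comparison.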

At this point, it may be worthwhile to be a bit more pedantic and professorial. Some 
authors give long-winded speeches about covariance versus invariance, and take great pain 
to distinguish the two. We should too. The equation ma = F is covariant, that is, the two 
sides transform the same way under rotations. The physics expressed by Newton’s second 
law is, however, invariant, that is, independent of observers related by a rotation. If physics 
depends on how you tilt your head, we are in trouble. Physics does not, but the way physics 
is expressed, in terms of equations, does. 

Here is the profound and trivial statement. Under a certain set of transformations, a 
purportedly fundamental equation is said to be covariant if the two sides of the equation 
transform in the same way. If so, then that transformation is known as a symmetry of 
physics.’ Physics is said to be invariant under that transformation. As we will see, both sides 
of Einstein’s field equation transform in the same way, as tensors, under what are known 
as general coordinate transformations. I will explain what a tensor is in the next chapter. I 
will allow myself the luxury of using the words invariance and covariance interchangeably 
and simply trust you to be discerning. 

Since we can always move the quantity on the right hand side of an equation to the
left hand side, we can rewrite a physical law of the form $\vec{u} = \vec{v}$ in the form $\vec{w} \equiv \vec{u} - \vec{v} =
0$. Physics students sometimes joke that they could already write down the ultimate
equation of physics, namely $\vec{w} = 0$, whatever $\vec{w}$ is. Thus, the statement of invariance
merely expresses the mathematically obvious fact that if $\vec{w} = 0$, then $R(\theta)\vec{w} = 0$. (Strictly
speaking, the 0 on the right hand side should be written as $\vec{0}$, but we don’t want to be that
pedantic!)

* The reader who has already been exposed to the special theory of relativity knows that this notion of invariance
represents the essence of Einstein’s insight. We will of course have a great deal more to say about that!


Descartes versus Euclid 


I remember how excited I was when I learned about analytic geometry. Surely you were 
excited too. What a genius, that Descartes! Henceforth, we could prove geometric theorems 
by doing algebra. After Descartes,* physics can no longer live without the concept of 
coordinates,* but he also managed to obscure what was once obvious to Euclid. We now 
must also insist on invariance. Indeed, the notion of invariance is at the heart of what we 
mean by geometry. 

For example, suppose somebody hands you a formula for the area of a triangle with
vertices at $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$. You had better insist that the formula is invariant under
rotation. In fact, this requirement, plus the requirement that the area should scale as the
square of the separation between the three vertices, suffices to determine the formula.
This simple example rings in the central motif of this book. 
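As an illustration, take the familiar determinant formula for the area (supplied here as an assumption, since the text does not write it out) and subject it to the two requirements: invariance under rotation and scaling as the square of the separations. The sample triangle and the angle are arbitrary.

```python
import math

def triangle_area(vertices):
    """Half the absolute value of a 2-by-2 determinant of edge vectors."""
    (a1, b1), (a2, b2), (a3, b3) = vertices
    return 0.5 * abs((a2 - a1) * (b3 - b1) - (a3 - a1) * (b2 - b1))

def rotate(theta, p):
    """Coordinates assigned by an observer whose axes are rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] + s * p[1], -s * p[0] + c * p[1])

triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
rotated = [rotate(0.7, p) for p in triangle]
doubled = [(2.0 * a, 2.0 * b) for a, b in triangle]

area = triangle_area(triangle)
area_rotated = triangle_area(rotated)
area_doubled = triangle_area(doubled)
print(area, area_rotated, area_doubled)
```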


Appendix 1: Differential operators rather than matrices 


Here I have to divide readers into the haves and the have-nots, but only temporarily. What I will say may sound 
difficult, but really, it amounts to not much more than a notational triviality. 

If you have studied quantum mechanics, you would know that the generators $\mathcal{J}$ of rotation studied here
are related to angular momentum operators. You would also know that in quantum mechanics, observables are
represented by hermitean operators. However, in our discussion, the $\mathcal{J}$s come out naturally as antisymmetric
matrices and are thus antihermitean. To make them hermitean, we multiply them by some multiples of $i$.

If you have not studied quantum mechanics, then the preceding would sound like gibberish to you, but do 
not worry. Simply take the attitude that, hey, it is a free country, and we can always invite ourselves to define a 
new set of physical quantities by multiplying an existing set of physical quantities by some constant. Heck, we 
could multiply by $17i$ if we want.

Even though here we are nowhere near quantum mechanics, we will bow to customary usage and define
$J_x = -i\mathcal{J}_x$ and so forth. From (14) we see that, for example, $J_z$ acting on the column vector $(x, y, z)$
gives $i(y, -x, 0)$. Thus, instead of using matrices, we could also represent $J_z$ by
$i\left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right)$, since
$J_z x = i\left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right)x = iy$,
$J_z y = i\left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right)y = -ix$, and
$J_z z = i\left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right)z = 0$.
Note that $J_z$ is precisely the z-component of the angular momentum operators in quantum mechanics.
We can naturally pass back and forth between matrices and differential operators. We will not make use
of this differential representation until a later chapter.


* Regarding the argument (which I mentioned in a footnote in the preface) between those who live with
coordinates and those who live coordinate free, I would say that the proof of angular momentum conservation,
which I already gave, not once, but twice in the two preceding chapters using coordinates, provides an example in
favor of the latter group: $\frac{d\vec l}{dt} = \frac{d}{dt}(\vec r \times \vec p) = m\frac{d}{dt}\left(\vec r \times \frac{d\vec r}{dt}\right) = m\frac{d\vec r}{dt} \times \frac{d\vec r}{dt} + m\vec r \times \frac{d^2\vec r}{dt^2} = 0$ for rotationally symmetric
potentials. While this indeed looks simpler than the two previous discussions, the former group could also say
that this requires learning “considerable formal math,” such as the cross product and its various properties.
Appendix 2: Rotations in higher dimensional space


Here we discuss rotations in D-dimensional Euclidean space. As you have no doubt heard, Einstein combined 
space and time into a 4-dimensional spacetime. Thus, what you will learn here about SO (4) will be put to good 
use.* If you prefer, you could skip this discussion and come back to it later. 

Start with a D-by-D matrix with 0 everywhere. Generalize (14). Stick a 1 into the mth row and nth column,
and a (−1) into the nth row and mth column. Call this matrix $\mathcal{J}_{(mn)}$. We put the subscripts (mn) in parentheses to
emphasize that (mn) labels the matrix. They are not indices to tell us which element of the matrix we are talking
about. As explained before, we define $J_{(mn)} = -i\mathcal{J}_{(mn)}$ so that explicitly

$$J^{ij}_{(mn)} = -i\left(\delta^{mi}\delta^{nj} - \delta^{mj}\delta^{ni}\right) \tag{19}$$

To repeat, in the symbol $J^{ij}_{(mn)}$ the indices $i$ and $j$ indicate respectively the row and column of the entry $J^{ij}_{(mn)}$ of
the matrix $J_{(mn)}$, while the indices $m$ and $n$, which I put in parentheses for pedagogical clarity, indicate which
matrix we are talking about. The first index $m$ on $\mathcal{J}_{(mn)}$ can take on $D$ values, and then the second index $n$ can take
on only $(D - 1)$ values since, obviously, $\mathcal{J}_{(mm)} = 0$. Also, since $\mathcal{J}_{(nm)} = -\mathcal{J}_{(mn)}$, we require $m > n$ to avoid double
counting. Thus, there are only $\frac{1}{2}D(D - 1)$ real antisymmetric D-by-D matrices $\mathcal{J}_{(mn)}$, and $A$ could be written as
a linear combination of them: $A = i\sum_{m>n}\theta_{mn}J_{(mn)}$, where $\theta_{mn}$ denote $\frac{1}{2}D(D - 1)$ real numbers. (As a check,
for $D = 2$ and 3, $\frac{1}{2}D(D - 1)$ equals 1 and 3, respectively.) The matrices $J_{(mn)}$ are known as the generators of the
group SO(D).

Notice a notational peculiarity: for SO(3), the $J$s could be labeled with one index rather than two indices. The
reason is simple. In this case, the indices $m$, $n$ take on 3 values, and so we could write $J_x = J_{23}$, $J_y = J_{31}$, and
$J_z = J_{12}$. We will, as we do here, often pass freely between the index sets (123) and (xyz). In general, rotations
are labeled by the plane they occur in, say the (m-n) plane spanned by the mth and nth axes. In 3-dimensional
space, and only in 3-dimensional space, a plane is uniquely specified by the vector perpendicular to it. Thus, a
rotation commonly spoken of as a rotation around the z-axis is better thought of as a rotation in the (1-2) plane,
that is, the (x-y) plane. (In this connection, note that the $\mathcal{J}$ in (7) appears as the upper left 2-by-2 block in $\mathcal{J}_z$ in
(14).) In contrast, for SO(4) it makes no sense to speak of a rotation around, say, the third axis.

The reader who has studied some group theory knows that the essence of the group is captured by the extent
to which the multiplication of two group elements does not commute. For rotations, everyday observations show
that $R(\theta)R(\theta')$ is in general quite different from $R(\theta')R(\theta)$. See figure 3.

Following Lie, we could try to capture this essence by focusing on infinitesimal rotations. Let $R_1 \simeq I + A$
and $R_2 \simeq I + B$. Then $R_1R_2 \simeq (I + A)(I + B) \simeq I + A + B + AB + O(A^2, B^2)$ (where rather pedantically we
have indicated that to the desired order if we keep $AB$, we should also keep terms of order $O(A^2, B^2)$, but we
will see immediately that they are irrelevant). If we multiply in the other order, we simply interchange $A$ and
$B$, thus $R_2R_1 \simeq (I + B)(I + A) \simeq I + B + A + BA + O(A^2, B^2)$. Hence, $R_1R_2$ and $R_2R_1$ differ by the amount
$[A, B] \equiv AB - BA$, a quantity known as the commutator between $A$ and $B$.

More formally, given two matrices $X$ and $Y$, to measure how they differ from each other, we could ask how
$X^{-1}Y$ differs from the identity. If $X = Y$, then this product is equal to the identity. Now, the inverse of a matrix
$I + A$ infinitesimally close to the identity is easy to determine: it is just $I - A$, since $(I - A)(I + A) = I + O(A^2)$.
Thus, let us calculate $(R_2R_1)^{-1}R_1R_2$:

$$(R_2R_1)^{-1}R_1R_2 = \left[I - (B + A + BA + O(A^2, B^2))\right]\left[I + A + B + AB + O(A^2, B^2)\right] = I + [A, B] + \cdots \tag{20}$$
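
The statement in (20) can be checked numerically. The following Python sketch is only an illustration under assumptions of my own: it takes $R_1$ to be a small rotation around the x-axis and $R_2$ a small rotation around the y-axis, both through the same angle `eps`, using the sign convention of exercise 3, and compares $(R_2R_1)^{-1}R_1R_2 - I$ with the commutator $[A, B]$ of their first-order parts.

```python
import math

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

eps = 1e-3
R1, R2 = rot_x(eps), rot_y(eps)

# First-order parts: R1 ~ I + A and R2 ~ I + B.
A = [[0.0, 0.0, 0.0], [0.0, 0.0, eps], [0.0, -eps, 0.0]]
B = [[0.0, 0.0, -eps], [0.0, 0.0, 0.0], [eps, 0.0, 0.0]]
commutator = sub(matmul(A, B), matmul(B, A))

# For a rotation the inverse is the transpose, so (R2 R1)^(-1) = (R2 R1)^T.
R2R1 = matmul(R2, R1)
mismatch = sub(matmul(transpose(R2R1), matmul(R1, R2)), IDENTITY)

# The two should agree up to terms of third order in eps.
max_error = max(abs(m - c) for rm, rc in zip(mismatch, commutator) for m, c in zip(rm, rc))
print(max_error)
```

The commutator here is of order `eps**2`, while the leftover difference is of order `eps**3`, which is the sense in which the dots in (20) are negligible.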


For SO(3), for example, $A$ is a linear combination of the $J_i$s, known as the generators of the Lie algebra. Thus,
we could write $A = i\sum_i \theta_i J_i$ and similarly $B = i\sum_j \theta'_j J_j$. Hence $[A, B] = i^2\sum_{ij}\theta_i\theta'_j[J_i, J_j]$, and so it suffices
to calculate the commutators $[J_i, J_j]$.

Recall that for two matrices $M_1$ and $M_2$, $(M_1M_2)^T = M_2^TM_1^T$. Transpose reverses the order. Thus $([J_i, J_j])^T =
-[J_i, J_j]$. In other words, the commutator $[J_i, J_j]$ is itself an antisymmetric 3-by-3 matrix and thus could be
written as a linear combination of the $J_k$s:


$$[J_i, J_j] = ic_{ijk}J_k \tag{21}$$


* Higher dimensional rotation groups often pop up in the most unlikely places in theoretical physics. For 
example, SO (4) is relevant for a deeper understanding of the spectrum of the hydrogen atom.° 


Figure 3 A marine recruit in a boot camp is standing and facing north. When the drill sergeant 
shouts, “Rotate by 90° eastward around the vertical axis” our recruit turns to face east. Suppose 
the sergeant next shouts, “Rotate by 90° westward around the north-south axis.” Our recruit 
ends up lying down on his back with his head pointing west, his feet pointing east. But what 
would happen if the sergeant reverses his two commands? You could easily verify that our recruit 
now ends up lying down on his left elbow, with his head pointing north. The order matters. For 
this reason, the study of rotations has been a bête noire for generations of physics students.


for a set of real (convince yourself of this!) numbers $c_{ijk}$. The summation over $k$ is implied by the repeated index
summation convention.
By explicit computation using (14), we find


$$[J_x, J_y] = iJ_z \tag{22}$$


You should work out the other commutators or argue by cyclic substitution $x \to y \to z \to x$. The three commutation
relations may be summarized by


$$[J_i, J_j] = i\epsilon_{ijk}J_k \tag{23}$$


We define the totally antisymmetric symbol $\epsilon_{ijk}$ by saying that it changes sign upon the interchange of any pair
of indices (and hence it vanishes when any two indices are equal) and by specifying that $\epsilon_{123} = 1$. In other words,
we found that $c_{ijk} = \epsilon_{ijk}$.

Lie’s great insight is that the preceding discussion holds for any group whose elements are labeled by a set of
continuous parameters (such as $\theta_i$, $i = 1, 2, 3$ in the case of SO(3)), groups now known as Lie groups. Expanding
the group elements around the origin, we arrive at (20) and hence the structure (21) for any continuous group.
The set of all commutation relations of the form (21) is said to define a Lie algebra, with $c_{ijk}$ referred to as the
structure constants of the algebra. The matrices $J_i$ are called the generators of the Lie algebra. The idea is that
by studying the Lie algebra, we go a long way toward understanding the group.
You should now work out (exercise 4), starting from (19), the Lie algebra for SO(D): 


$$[J_{(mn)}, J_{(pq)}] = i\left(\delta_{mp}J_{(nq)} + \delta_{nq}J_{(mp)} - \delta_{np}J_{(mq)} - \delta_{mq}J_{(np)}\right) \tag{24}$$


This may look rather involved to the uninitiated, but in fact it is quite simple. First, the right hand side,
a linear combination of the $J$s, as required by the general argument above, is completely fixed by the first
term by noting that the left hand side is antisymmetric under three separate interchanges: $m \leftrightarrow n$, $p \leftrightarrow q$, and
$(mn) \leftrightarrow (pq)$. Next, all those Kronecker deltas just say that if the two sets $(mn)$ and $(pq)$ have no integer in
common, then the commutator vanishes. If they do have an integer in common, you simply “cross off” that
integer. This is best explained by using SO(4) as an example. We have $[J_{(12)}, J_{(34)}] = 0$, $[J_{(12)}, J_{(14)}] = iJ_{(24)}$,
$[J_{(23)}, J_{(31)}] = -iJ_{(21)} = iJ_{(12)}$, and so forth. The first of these relations says that rotations in the (1-2) plane and
in the (3-4) plane commute, as you might expect. Do write down a few more and you will get it.
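
For readers who would like a machine to "write down a few more," here is a small Python sketch (my own illustration, not part of the exercise) that builds the matrices $J_{(mn)}$ from (19) and checks the commutation relations (24) by brute force over all index choices. The function names are hypothetical helpers introduced only for this check.

```python
from itertools import product

def generator(m, n, D):
    """J_(mn) of (19): -i times the matrix with +1 at (m, n) and -1 at (n, m).
    The labels m and n run from 1 to D; J_(mm) is the zero matrix."""
    J = [[0j] * D for _ in range(D)]
    if m != n:
        J[m - 1][n - 1] = -1j
        J[n - 1][m - 1] = 1j
    return J

def matmul(X, Y):
    D = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(D)) for j in range(D)] for i in range(D)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(XY, YX)]

def right_hand_side(m, n, p, q, D):
    """The right hand side of (24) as an explicit D-by-D matrix."""
    d = lambda a, b: 1 if a == b else 0
    terms = [(d(m, p), generator(n, q, D)), (d(n, q), generator(m, p, D)),
             (-d(n, p), generator(m, q, D)), (-d(m, q), generator(n, p, D))]
    return [[1j * sum(coef * J[i][j] for coef, J in terms) for j in range(D)] for i in range(D)]

def algebra_holds(D):
    """Check (24) for every choice of the labels m, n, p, q."""
    for m, n, p, q in product(range(1, D + 1), repeat=4):
        lhs = commutator(generator(m, n, D), generator(p, q, D))
        rhs = right_hand_side(m, n, p, q, D)
        if any(abs(lhs[i][j] - rhs[i][j]) > 1e-12 for i in range(D) for j in range(D)):
            return False
    return True

def count_generators(D):
    """Number of independent generators, counting only m > n."""
    return sum(1 for m in range(1, D + 1) for n in range(1, m))

print(algebra_holds(3), algebra_holds(4), count_generators(4))
```

For D = 3 the same check reproduces (22) and (23) with $J_x = J_{(23)}$, $J_y = J_{(31)}$, $J_z = J_{(12)}$.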


Exercises 


1 Suppose we are given two vectors $\vec p$ and $\vec q$ in ordinary 3-dimensional space. Consider this array of three
numbers: $\begin{pmatrix} p^2q^3 \\ p^3q^1 \\ p^1q^2 \end{pmatrix}$. Prove that it is not a vector, even though it looks like a vector. (Check how it transforms
under rotation!) In contrast, $\begin{pmatrix} p^2q^3 - p^3q^2 \\ p^3q^1 - p^1q^3 \\ p^1q^2 - p^2q^1 \end{pmatrix}$ does transform like a vector. It is in fact the vector cross product
$\vec p \times \vec q$.


2 Show that the product of two delta functions $\delta(x)\delta(y)$ is invariant under rotation around the origin.


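3 Using (14) show that a rotation around the x-axis through angle $\theta_x$ is given by


$$R_x(\theta_x) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_x & \sin\theta_x \\ 0 & -\sin\theta_x & \cos\theta_x \end{pmatrix}$$


Write down $R_y(\theta_y)$. Show explicitly that $R_x(\theta_x)R_y(\theta_y) \neq R_y(\theta_y)R_x(\theta_x)$.


4 Calculate $[J_{(mn)}, J_{(pq)}]$.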


5 Given a 3-vector $\vec p$, show that the quantity $p^ip^j$ when averaged over the direction of $\vec p$ is given by
$\frac{1}{4\pi}\int d\theta\, d\varphi \cos\theta\, p^ip^j = \frac{1}{3}\vec p^{\,2}\delta^{ij}$.


Notes 


1. Outside of physics, people often erroneously call any array of numbers a vector. Of course, people are free to 
call anything anything, so let’s not quibble about the word “erroneously.” 

2. I say “most, but not all,” because it is conceivable that you are a native speaker of Guugu Yimithirr. See 
G. Deutscher, Through the Language Glass, H. Holt and Co., 2010, p. 161. 

3. The intellectual precision of our definition of symmetry is necessary lest we make the same mistake as the 
ancient Greeks. See Fearful, pp. 11-12 and figure 2.2. 

4. According to one story, take it or leave it, Descartes was lying in bed when he noticed a fly buzzing around 
the room. He then realized that he could fix the fly’s position given how far the fly was from two intersecting 
walls and the ceiling. 

5. For example, J. J. Sakurai and J. Napolitano, Modern Quantum Mechanics, pp. 265-268. 


P) 


A tensor is something that transforms like a tensor 


Long ago, an undergrad who later became a distinguished condensed matter physicist 
came to me after a class on group theory and asked me, “What exactly is a tensor?” I told 
him that a tensor is something that transforms like a tensor. When I ran into him many 
years later, he regaled me with the following story. At his graduation, his father, perhaps 
still smarting from the hefty sum he had paid to the prestigious private university his son 
attended, asked him what was the most memorable piece of knowledge he acquired during 
his four years in college. He replied, “A tensor is something that transforms like a tensor.” 

But this should not perplex us. A duck is something that quacks like a duck. Mathematical
objects could also be defined by their behavior. We already saw in the preceding chapter
that a vector is defined by how it transforms: $V'^i = R^{ij}V^j$. Consider a collection of
“mathematical entities” $T^{ij}$ with $i, j = 1, 2, \cdots, D$ in $D$-dimensional space. If they transform
under rotations according to


$$T^{ij} \to T'^{ij} = R^{ik}R^{jl}T^{kl} \tag{1}$$


then we say that $T$ transforms like a tensor, and hence is a tensor. (Here we are using the
Einstein summation convention introduced in the previous chapter: The right hand side
actually means $\sum_{k=1}^{D}\sum_{l=1}^{D}R^{ik}R^{jl}T^{kl}$ and is a sum of $D^2$ terms.) Indeed, we see that we
are just generalizing the transformation law of a vector.
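
To make the transformation law (1) concrete, here is a minimal Python sketch for $D = 3$. The particular rotation, the vectors `v` and `w`, and the helper names are illustrative assumptions of mine. The sketch spells out the double sum in (1) and confirms that the array of products $v^iw^j$ transforms in exactly this way when $v$ and $w$ are rotated as vectors.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transform_vector(R, V):
    # V'^i = R^{ij} V^j
    return [sum(R[i][j] * V[j] for j in range(3)) for i in range(3)]

def transform_tensor(R, T):
    # T'^{ij} = R^{ik} R^{jl} T^{kl}: a sum of 3**2 = 9 terms for each (i, j)
    return [[sum(R[i][k] * R[j][l] * T[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

R = matmul(rotation_z(0.3), rotation_x(1.1))
v = [1.0, 2.0, 3.0]
w = [-1.0, 0.5, 4.0]

# Build the 9 objects v^i w^j and transform them with the tensor law (1).
outer = [[v[i] * w[j] for j in range(3)] for i in range(3)]
rotated_outer = transform_tensor(R, outer)

# Alternatively, rotate v and w as vectors first and then form the products.
v2, w2 = transform_vector(R, v), transform_vector(R, w)
outer_of_rotated = [[v2[i] * w2[j] for j in range(3)] for i in range(3)]

max_diff = max(abs(rotated_outer[i][j] - outer_of_rotated[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The two routes agree to rounding error, which is all that "transforms like a tensor" asks for.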


Fear of tensors 


In my experience teaching, a couple of students are invariably confused by the notion of 
tensors. The very word “tensor” apparently makes them tense. Dear reader, if you are not
one of these unfortunates, so much the better for you! You could zip through this chapter. 
But to allay the nameless fear of the tensorphobe, I will go slow and be specific. 
Think of the tensor $T^{ij}$ as a collection of $D^2$ mathematical entities that transform
into linear combinations of one another. To help the reader focus, I will often specialize
to $D = 3$. Compounded and intertwined with their fear of tensors, the unfortunates
mentioned above are also unaccountably afraid of indices, as mentioned in
chapter I.1. For them, let us list $T^{ij}$ explicitly for $D = 3$. There are $3^2 = 9$ of them:
$T^{11}, T^{12}, T^{13}, T^{21}, T^{22}, T^{23}, T^{31}, T^{32}, T^{33}$. That’s it, 9 objects that transform into linear
combinations of one another. For example, (1) says that $T'^{21} = R^{2k}R^{1l}T^{kl} = R^{21}R^{11}T^{11} +
R^{21}R^{12}T^{12} + R^{21}R^{13}T^{13} + R^{22}R^{11}T^{21} + R^{22}R^{12}T^{22} + R^{22}R^{13}T^{23} + R^{23}R^{11}T^{31} +
R^{23}R^{12}T^{32} + R^{23}R^{13}T^{33}$. This shows explicitly, as if there were any doubt to begin with,
that $T'^{21}$ is given by a particular linear combination of the 9 objects. That's all: the tensor
$T^{ij}$ consists of 9 objects that transform into linear combinations of themselves under
rotations.

We could generalize further and define* 3-indexed tensors, 4-indexed tensors, and so
forth by such transformation laws as $W'^{ijn} = R^{ik}R^{jl}R^{nm}W^{klm}$. Here we will focus on
2-indexed tensors, and if we say tensor without any qualifier, we often, but not always, mean
a 2-indexed tensor. With this definition, we might say that a vector is a 1-indexed tensor
and a scalar is a 0-indexed tensor, but this usage is not common. A scalar transforms as a
tensor with no index at all, namely $S' = S$; in other words, a scalar does not transform.


Tensor field 


In the preceding chapter, we introduced the notion of a vector field $V^i(\vec x)$, nothing more or
less than a vector function of position. That it is a vector means that it transforms according
to $V'^i(\vec x') = R^{ij}V^j(\vec x)$. Now consider the derivative of this vector field $\frac{\partial V^j(\vec x)}{\partial x^i}$, which we will
call $W^{ij}(\vec x)$.
Use the fact that $\vec x' = R\vec x$ implies $\vec x = R^{-1}\vec x' = R^T\vec x'$ and thus $\frac{\partial x^k}{\partial x'^h} = (R^T)^{kh} = R^{hk}$. (The
O in the rotation group SO(D) is crucial: the inverse of a rotation is its transpose.) Then


$$\frac{\partial}{\partial x'^h} = \frac{\partial x^k}{\partial x'^h}\frac{\partial}{\partial x^k} = R^{hk}\frac{\partial}{\partial x^k} \tag{2}$$

Thus

$$W'^{hi}(\vec x') = \frac{\partial V'^i(\vec x')}{\partial x'^h} = R^{hk}\frac{\partial}{\partial x^k}\left(R^{ij}V^j(\vec x)\right) = R^{hk}R^{ij}\frac{\partial V^j(\vec x)}{\partial x^k} = R^{hk}R^{ij}W^{kj}(\vec x) \tag{3}$$


Comparing with (1) we see that $W^{ij}(\vec x)$ transforms like a tensor and, hence, is a tensor.
Indeed, it is a tensor field.

Notice that a tensor $T^{ij}$ transforms as if it were composed of two vectors $v^iw^j$, that
is, $T^{ij}$ and $v^iw^j$ transform in the same way. (Compare $v^iw^j \to v'^iw'^j = R^{ik}v^kR^{jl}w^l =
R^{ik}R^{jl}v^kw^l$ with (1).) It is important to recognize that only in exceptional cases does a
tensor $T^{ij}$ happen to be equal to $v^iw^j$ for some $v$ and $w$. In general, a tensor cannot be
written in the form $v^iw^j$. Our tensor field $W^{ij}(\vec x)$ offers a ready example: in general, it is
not equal to some vector $U^i$ multiplied by $V^j(\vec x)$.


* Our friend the Jargon Guy tells us that the number of indices carried by a tensor is known as its rank. (The
Jargon Guy is a new friend of the author; he did not appear in QFT Nut.)

Also, note in our example that the differential operator $\frac{\partial}{\partial x^i}$ transforms (see (2)) like a vector.
For example, if $\phi'(\vec x') = \phi(\vec x)$ transforms like a scalar, then $\frac{\partial\phi}{\partial x^i}$ transforms like a vector.
Indeed, that’s why you have encountered the notation $\vec\nabla$ for the gradient in an elementary
physics course. This remark will be important later when we revisit Newton's inverse
square law in chapter II.3. Do exercise 1 now.


Representation theory 


Go back to the 9 objects $T^{ij}$ that form a tensor. Mentally arrange them in a column

$$\begin{pmatrix} T^{11} \\ T^{12} \\ \vdots \\ T^{33} \end{pmatrix}$$

The linear transformation on the 9 objects can then be represented by a 9-by-9 matrix
$D(R)$ acting on this column. (Here we are going painfully slowly because of common
confusion on this point. Some authors refer to this column as a 9-component “vector,”
which is a horrible abuse of terminology. We reserve the word “vector” for something that
transforms like a vector $V'^i = R^{ij}V^j$. It is not true that any old collection of stuff arranged
in a column is a vector. Don’t call anything with feathers a duck!)

For every rotation, specified by a 3-by-3 matrix $R$, we could thus associate a 9-by-9 matrix
$D(R)$ transforming the 9 objects $T^{ij}$ linearly among themselves. We say that the 9-by-9
matrix $D(R)$ represents the rotation matrix $R$ in the sense that


$$D(R_1)D(R_2) = D(R_1R_2) \tag{4}$$


Multiplication of $D(R_1)$ and $D(R_2)$ mirrors the multiplication of $R_1$ and $R_2$, as it were. The
tensor $T$ is said to furnish a 9-dimensional representation of the rotation group SO(3).
The 9-by-9 matrices $D(R)$ represent $R$. Notice that with this jargon, the vector furnishes a
3-dimensional representation of the rotation group, known as the defining or fundamental
representation.
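
The 9-by-9 matrix $D(R)$ can be written down explicitly: with the 9 objects ordered as $T^{11}, T^{12}, \cdots, T^{33}$, its entry in row $(ij)$ and column $(kl)$ is $R^{ik}R^{jl}$, as (1) dictates. The Python sketch below builds $D(R)$ this way and checks (4) for two sample rotations. The rotations chosen and the function name `representation` are assumptions made only for illustration.

```python
import math

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def representation(R):
    """The 9-by-9 matrix D(R) acting on the column (T11, T12, ..., T33)."""
    pairs = [(i, j) for i in range(3) for j in range(3)]
    return [[R[i][k] * R[j][l] for (k, l) in pairs] for (i, j) in pairs]

R1 = matmul(rotation_z(0.4), rotation_x(1.3))
R2 = matmul(rotation_x(-0.8), rotation_z(2.1))

left = matmul(representation(R1), representation(R2))   # D(R1) D(R2)
right = representation(matmul(R1, R2))                   # D(R1 R2)

max_diff = max(abs(left[a][b] - right[a][b]) for a in range(9) for b in range(9))
print(len(left), max_diff)
```

The agreement of `left` and `right` is the statement that multiplying the big matrices mirrors multiplying the rotations themselves.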


Reducible versus irreducible 


Let us now pose the central question of representation theory. Given these 9 entities $T^{ij}$
that transform into each other, consider the 9 independent linear combinations that we
can form out of them. Is there a subset among them that only transform into each other?
A secret in-club, as it were.

A moment’s thought reveals that there is indeed an in-club. Consider $A^{ij} \equiv T^{ij} - T^{ji}$.
Under a rotation,

$$A^{ij} \to A'^{ij} \equiv T'^{ij} - T'^{ji} = R^{ik}R^{jl}T^{kl} - R^{jk}R^{il}T^{kl}
= R^{ik}R^{jl}T^{kl} - R^{jl}R^{ik}T^{lk} = R^{ik}R^{jl}(T^{kl} - T^{lk}) = R^{ik}R^{jl}A^{kl} \tag{5}$$


I have again gone painfully slow here, but it is obvious, isn’t it? We just verified in (5) that
$A^{ij}$ transforms like a tensor and is thus a tensor. Furthermore, this tensor changes sign
upon interchange of its two indices ($A^{ij} = -A^{ji}$) and is said to be antisymmetric. The
transformation law (1) treats the two indices democratically, without favoring one over
the other, and thus preserves the antisymmetric character of a tensor: if $A^{ij} = -A^{ji}$, then
$A'^{ij} = -A'^{ji}$ also.

Let us count. The index $i$ in $A^{ij}$ could take on $D$ values; for each of these values,
the index $j$ could take on only $D - 1$ values (since the $D$ diagonal elements $A^{ii} = 0$ for
$i = 1, 2, \cdots, D$, no Einstein repeated index summation here); but to avoid double counting
(since $A^{ij} = -A^{ji}$) we should divide by 2. Hence, the number of independent components
in $A$ is equal to $\frac{1}{2}D(D - 1)$. For example, for $D = 3$, we have the 3 objects: $A^{12}$, $A^{23}$, and $A^{31}$.
The attentive reader would recall that we did the same counting in the previous chapter.

Obviously, the same goes for the symmetric combination $S^{ij} \equiv T^{ij} + T^{ji}$. You could
verify as a trivial exercise that $S'^{ij} = R^{ik}R^{jl}S^{kl}$. A tensor $S^{ij}$ that does not change sign
upon interchange of its two indices ($S^{ij} = S^{ji}$) is said to be symmetric. Evidently, the
symmetric tensor $S$ has more components than the antisymmetric tensor $A$. In addition to the
components $S^{ij}$ with $i \neq j$, $S$ also has $D$ diagonal components, namely $S^{11}, S^{22}, \cdots, S^{DD}$.
Thus, the number of independent components in $S$ is equal to $\frac{1}{2}D(D - 1) + D =
\frac{1}{2}D(D + 1)$.

For $D = 3$, the number of components in $A$ and $S$ are $\frac{1}{2} \cdot 3 \cdot 2 = 3$ and $\frac{1}{2} \cdot 3 \cdot 4 = 6$,
respectively. (For $D = 4$, the number of components in $A$ and $S$ are 6 and 10, respectively.)
Thus, in a suitable basis, the 9-by-9 matrix referred to above actually breaks up into a
3-by-3 block and a 6-by-6 block. We say that the 9-dimensional representation is reducible:
it could be reduced to smaller representations.

But we are not done yet. The 6-dimensional representation is also reducible. To see this,
note


$$S'^{ii} = R^{ik}R^{il}S^{kl} = (R^T)^{ki}R^{il}S^{kl} = (R^{-1})^{ki}R^{il}S^{kl} = \delta^{kl}S^{kl} = S^{kk} \tag{6}$$


where we have used the O in SO(D). (Here we are using repeated index summation:
the indices $i$ and $k$ are both summed over.) In other words, the linear combination
$S^{11} + S^{22} + \cdots + S^{DD}$, the trace of $S$, transforms into itself, that is, does not transform
at all. It is a loner forming an in-club of one. The 6-by-6 matrix describing the linear
transformation of the 6 objects $S^{ij}$ breaks up into a 1-by-1 block and a 5-by-5 block. See
figure 1.

Again, for the sake of the beginning student, let us work out explicitly the 5 objects that
furnish the representation 5 of SO(3). First define a traceless symmetric tensor $\tilde S$ by


$$\tilde S^{ij} = S^{ij} - \delta^{ij}(S^{kk}/D) \tag{7}$$


(The repeated index $k$ is summed over.) Explicitly, $\tilde S^{ii} = S^{ii} - D(S^{kk}/D) = 0$, and $\tilde S$ is
traceless. Specialize to $D = 3$. Now we have only 5 objects, namely $\tilde S^{11}, \tilde S^{22}, \tilde S^{12}, \tilde S^{13}, \tilde S^{23}$.

$9 \to 5 + 3 + 1$


Figure 1 How the collection of 9 objects T'/ splits up. The figure is meant 
to be schematic: the dots do not represent the original 9 objects, but linear 
combinations of them, and the positions of the dots are not meaningful. 


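We do not count $\tilde S^{33}$ separately, since it is equal to $-(\tilde S^{11} + \tilde S^{22})$. Under an SO(3) rotation,
these 5 objects transform into linear combinations of one another, as we just explained.

Let us be specific: the object $\tilde S^{13}$, for example, transforms into $\tilde S'^{13} = R^{1k}R^{3l}\tilde S^{kl} =
R^{11}R^{31}\tilde S^{11} + R^{11}R^{32}\tilde S^{12} + R^{11}R^{33}\tilde S^{13} + R^{12}R^{31}\tilde S^{21} + R^{12}R^{32}\tilde S^{22} + R^{12}R^{33}\tilde S^{23} + R^{13}R^{31}\tilde S^{31}
+ R^{13}R^{32}\tilde S^{32} + R^{13}R^{33}\tilde S^{33} = (R^{11}R^{31} - R^{13}R^{33})\tilde S^{11} + (R^{11}R^{32} + R^{12}R^{31})\tilde S^{12} + (R^{11}R^{33} +
R^{13}R^{31})\tilde S^{13} + (R^{12}R^{32} - R^{13}R^{33})\tilde S^{22} + (R^{12}R^{33} + R^{13}R^{32})\tilde S^{23}$, where in the last equality,
we used $\tilde S^{ij} = \tilde S^{ji}$ and $\tilde S^{33} = -(\tilde S^{11} + \tilde S^{22})$. Indeed, $\tilde S^{13}$ transforms into a linear combination
of $\tilde S^{11}, \tilde S^{12}, \tilde S^{13}, \tilde S^{22}, \tilde S^{23}$.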

To summarize, what we found is that if, instead of the basis consisting of the 9 entities
$T^{ij}$, we use the basis consisting of the 3 entities $A^{ij}$, the single entity $S^{kk}$ (remember
repeated index summation!), and the 5 entities $\tilde S^{ij}$, the 9-by-9 matrix $D(R)$ (that represents
rotation in the sense of (4)) breaks up into a 3-by-3 matrix, a 1-by-1 matrix, and a 5-by-5
matrix “stacked on top of each other.” This is represented schematically as


$$D(R) = (\text{9-by-9 matrix}) \to \begin{pmatrix} (\text{3-by-3 block}) & 0 & 0 \\ 0 & (\text{1-by-1 block}) & 0 \\ 0 & 0 & (\text{5-by-5 block}) \end{pmatrix} \tag{8}$$


Note that once we chose the new basis, this decomposition holds true for all rotations. 
(For the readers who know their linear algebra, the technical statement is that there exists 
a similarity transformation that block-diagonalizes D(R) for all R. Incidentally, we will 
encounter plenty of similarity transformations later.) 

More generally, the $D^2$ representation furnished by a general 2-indexed tensor decomposes
into a $\frac{1}{2}D(D - 1)$-dimensional representation, a $(\frac{1}{2}D(D + 1) - 1)$-dimensional
representation, and a 1-dimensional representation. We say that in SO(3), $9 = 5 + 3 + 1$. (In
SO(4), $16 = 9 + 6 + 1$.)

You might have noticed that in this entire discussion we never had to write out $R$
explicitly in terms of the 3 rotation angles and how the 5 objects $\tilde S^{11}, \cdots, \tilde S^{23}$ transform
into one another in terms of these angles. It is only the counting that matters. You might
regard that as the difference between mathematics and arithmetic.
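
Although only the counting matters, a little arithmetic can be reassuring. The Python sketch below is my own illustration, with an arbitrary sample tensor and rotation. It splits a tensor into its antisymmetric part, its trace, and its traceless symmetric part, and checks that a rotation never mixes the three pieces. The function `dimensions` simply evaluates the counting formulas of the text.

```python
import math

def rotation(theta_z, theta_x):
    """A sample rotation built as R_z(theta_z) R_x(theta_x)."""
    cz, sz = math.cos(theta_z), math.sin(theta_z)
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    Rz = [[cz, sz, 0.0], [-sz, cz, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cx, sx], [0.0, -sx, cx]]
    return [[sum(Rz[i][k] * Rx[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transform(R, T):
    # T'^{ij} = R^{ik} R^{jl} T^{kl}
    n = len(T)
    return [[sum(R[i][k] * R[j][l] * T[k][l] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

def decompose(T):
    """Return (A, trace of S, traceless part of S) with A = T - T^T and S = T + T^T."""
    n = len(T)
    A = [[T[i][j] - T[j][i] for j in range(n)] for i in range(n)]
    S = [[T[i][j] + T[j][i] for j in range(n)] for i in range(n)]
    trace = sum(S[k][k] for k in range(n))
    S_tilde = [[S[i][j] - (trace / n if i == j else 0.0) for j in range(n)] for i in range(n)]
    return A, trace, S_tilde

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def dimensions(D):
    """Sizes of the antisymmetric, traceless symmetric, and trace pieces."""
    return D * (D - 1) // 2, D * (D + 1) // 2 - 1, 1

T = [[1.0, 2.0, -3.0], [0.5, 4.0, 7.0], [-1.0, 6.0, 2.5]]
R = rotation(0.4, 1.3)

A, trace, S_tilde = decompose(T)
A_rot, trace_rot, S_tilde_rot = decompose(transform(R, T))

# Each piece of the rotated tensor is the rotated version of the same piece.
print(max_abs_diff(A_rot, transform(R, A)),
      abs(trace_rot - trace),
      max_abs_diff(S_tilde_rot, transform(R, S_tilde)))
print(dimensions(3), dimensions(4))
```

The last line prints the sizes of the three in-clubs for $D = 3$ and $D = 4$, which add up to 9 and 16, respectively.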
Figure 2 Under SO(3), the 5 objects inside the solid line transform
into linear combinations of each other, but under the smaller group of
transformations SO(2), the objects inside each of the 3 dashed lines
transform into linear combinations of each other. The 5 breaks up as
$5 \to 2 + 2 + 1$. As in figure 1, this figure is meant to be schematic.


Restriction to a subgroup 


You definitely do not have to master group theory¹ to read this book, but it would be useful
for you to learn a few basic concepts and to be able to count. For instance, the notion of a
subgroup. Consider the group SO(2) that we studied to exhaustion, consisting of rotations
around the z-axis, say. Evidently, SO(2) is a subgroup of SO(3) in that its elements are all
elements of SO(3) and form a group all by themselves. The components of the 3-vector $V^i$
could be split into two sets: $(V^1, V^2)$ and $V^3$. Under a rotation around the z-axis, $(V^1, V^2)$
transform as a 2-vector and $V^3$ as a scalar. We say that upon restriction to the subgroup
SO(2), the irreducible representation 3 breaks up into the representations 2 and 1 of the
subgroup, a decomposition we write as $3 \to 2 + 1$. All the group theoretic results we need
in this book could be obtained by explicit listing and simple counting.

Look at the 5 objects, $\tilde S^{11}, \tilde S^{22}, \tilde S^{12}, \tilde S^{13}, \tilde S^{23}$, that furnish the representation 5 of SO(3).
Now consider a restriction to the subgroup SO(2). In other words, we restrict ourselves to
rotations around the z-axis, that is, rotations under which $V^3 \to V'^3 = V^3$, namely rotations
with $R^{33} = 1$ and $R^{13}, R^{23}, R^{31}, R^{32}$ all vanishing. Since SO(2) does not touch the index
3, we conclude immediately that the combination $\tilde S^{11} + \tilde S^{22} = -\tilde S^{33}$ does not transform,
or in other words, it transforms as a singlet under SO(2). Similarly, the pair $(\tilde S^{13}, \tilde S^{23})$
transforms as a doublet, since the index 3 is “invisible” to SO(2): the group transforms
the indices 1 and 2 into each other, while leaving the index 3 alone. Indeed, we see that our
earlier expression for $\tilde S'^{13}$ collapses to $\tilde S'^{13} = R^{11}\tilde S^{13} + R^{12}\tilde S^{23}$, as expected. Finally, you can
verify that the remaining combinations $(\tilde S^{12}, \tilde S^{11} - \tilde S^{22})$ transform like a doublet. These
results could be summarized by saying that, upon restriction to the subgroup SO(2), the
irreducible representation 5 of the group SO(3) breaks up as $5 \to 2 + 2 + 1$. See figure 2.
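
The breakup of the 5 under SO(2) can also be seen numerically. In the Python sketch below (an illustration with a sample traceless symmetric tensor of my choosing), a rotation around the z-axis through angle `theta` leaves the first combination alone, rotates the second pair through `theta`, and, with the normalization $\frac{1}{2}(\tilde S^{11} - \tilde S^{22})$ that I have assumed for convenience, rotates the last pair through twice that angle.

```python
import math

theta = 0.6
c, s = math.cos(theta), math.sin(theta)
# Rotation around the z-axis: R33 = 1 and R13, R23, R31, R32 all vanish.
R = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

# A sample traceless symmetric tensor (1.0 + 2.5 - 3.5 = 0).
S = [[1.0, 0.3, -2.0], [0.3, 2.5, 0.7], [-2.0, 0.7, -3.5]]

# S'^{ij} = R^{ik} R^{jl} S^{kl}
S_rot = [[sum(R[i][k] * R[j][l] * S[k][l] for k in range(3) for l in range(3))
          for j in range(3)] for i in range(3)]

# Singlet: S11 + S22 is untouched.
singlet_change = (S_rot[0][0] + S_rot[1][1]) - (S[0][0] + S[1][1])

# Doublet (S13, S23): rotates like the 2-vector (V1, V2).
doublet_error = max(abs(S_rot[0][2] - (c * S[0][2] + s * S[1][2])),
                    abs(S_rot[1][2] - (-s * S[0][2] + c * S[1][2])))

# Doublet (S12, (S11 - S22)/2): the two mix only with each other, through angle 2*theta.
c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
u, v = S[0][1], 0.5 * (S[0][0] - S[1][1])
u_rot, v_rot = S_rot[0][1], 0.5 * (S_rot[0][0] - S_rot[1][1])
second_doublet_error = max(abs(u_rot - (c2 * u - s2 * v)),
                           abs(v_rot - (s2 * u + c2 * v)))

print(singlet_change, doublet_error, second_doublet_error)
```

All three printed numbers vanish to rounding error, in accord with $5 \to 2 + 2 + 1$.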


Tensors in Newtonian mechanics 


Let us give another example, particularly apt for a book on gravity, of a Newtonian tensor.
Consider two nearby particles moving in a potential. Denote their trajectories by $\vec x(t)$
and $\vec y(t)$, respectively, determined by $\frac{d^2x^i}{dt^2} = -\partial^iV(\vec x)$ and $\frac{d^2y^i}{dt^2} = -\partial^iV(\vec y)$. (I am also
testing whether there are any readers who do not understand thoroughly the concept of
notational freedom.) We want to know how the separation vector $\vec s \equiv \vec y - \vec x$ changes with
time, keeping terms to leading order in $\vec s$:


$$\frac{d^2s^i}{dt^2} = -\left[\partial^iV(\vec y) - \partial^iV(\vec x)\right] = -\left[\partial^iV(\vec x + \vec s) - \partial^iV(\vec x)\right] \simeq -\partial^i\partial^jV(\vec x)\,s^j$$


The object $R^{ij}(\vec x) \equiv \partial^i\partial^jV(\vec x)$ is manifestly a tensor if $V(\vec x)$ is a scalar. For example, verify
that $R^{ij} = GM(\delta^{ij}r^2 - 3x^ix^j)/r^5$ for the gravitational potential $V(\vec x) = -GM/r$. Note that
$R^{ij}$ is a symmetric traceless tensor. Since $R^{ii} = \partial^i\partial^iV(\vec x) = \nabla^2V$, the tracelessness merely
reaffirms the fact that the $1/r$ potential satisfies Laplace’s equation $\nabla^2V = 0$. Also, $R$ is
manifestly not the product of two vectors, but it transforms as if it were.
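
The verification suggested above can also be done numerically. The Python sketch below assumes units in which $GM = 1$ and picks an arbitrary sample point (both are choices of mine). It compares the closed-form expression for the tensor of second derivatives of $V = -GM/r$ with central finite differences, and checks that the trace vanishes.

```python
import math

GM = 1.0  # illustrative units; any positive value would do

def potential(x):
    return -GM / math.sqrt(sum(c * c for c in x))

def tidal_formula(x):
    """GM (delta^{ij} r^2 - 3 x^i x^j) / r^5, as quoted in the text."""
    r2 = sum(c * c for c in x)
    r5 = r2 ** 2.5
    return [[GM * ((r2 if i == j else 0.0) - 3.0 * x[i] * x[j]) / r5
             for j in range(3)] for i in range(3)]

def second_derivative(f, x, i, j, h=1e-3):
    """Central finite-difference estimate of the second derivative of f along i and j."""
    def f_at(di, dj):
        y = list(x)
        y[i] += di * h
        y[j] += dj * h
        return f(y)
    return (f_at(1, 1) - f_at(1, -1) - f_at(-1, 1) + f_at(-1, -1)) / (4.0 * h * h)

point = [1.0, -2.0, 0.5]
exact = tidal_formula(point)
numeric = [[second_derivative(potential, point, i, j) for j in range(3)] for i in range(3)]

max_diff = max(abs(exact[i][j] - numeric[i][j]) for i in range(3) for j in range(3))
trace = sum(exact[i][i] for i in range(3))

# On the z-axis at distance r = 2 the tensor is diagonal, proportional to (1, 1, -2).
on_axis = tidal_formula([0.0, 0.0, 2.0])
print(max_diff, trace, [on_axis[i][i] for i in range(3)])
```

The on-axis values illustrate the pattern of signs responsible for stretching along the radial direction and squeezing in the two transverse directions.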

Let us see how rotational covariance works in the equation

$$\frac{d^2s^i}{dt^2} = -R^{ij}s^j \tag{9}$$

The right hand side has to be linear in the vector $\vec s$. Since the left hand side transforms like
a vector, the right hand side must also: indeed, it is given by a tensor $R$ contracted* with
a vector $\vec s$. A tensor is needed on the right hand side.

Imagine yourself falling toward a spherical planet or star. With no loss of generality,
let your location at some instant be $(0, 0, r)$ along the z-axis. The tensor $R$ written out
as a matrix is then diagonal and is given by (for example, $R^{33} = GM(\delta^{33}r^2 - 3x^3x^3)/r^5 =
GM(1 - 3)/r^3$)


$$R = \frac{GM}{r^3}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix} \tag{10}$$


Thus, the sign of $\frac{d^2\vec s}{dt^2}$ depends on the orientation of $\vec s$.

To see why this is so and to understand what tensors are all about, imagine surrounding 
yourself with a circular arrangement of balls lying in the (x-z) plane (see figure 3a) and 
initially at rest in your frame. Using (9) and (10), we can now write down how the separation 
between two balls along different directions changes. 

Since we are going to specify the direction, we will denote the separation simply by
$s$. Along the z-axis, $s$ grows according to (see (9)) $\frac{d^2s}{dt^2} = -R^{33}s = +\frac{2GM}{r^3}s$. The plus sign
indicates that the two balls move away from each other. In contrast, along the x-axis, $s$
decreases according to $\frac{d^2s}{dt^2} = -R^{11}s = -\frac{GM}{r^3}s$. The two balls approach each other. (Similarly
for two balls aligned along the y-axis.) (Note that acting on $\vec s$ on the right hand side of (9)
by a tensor makes it possible for $\frac{d^2\vec s}{dt^2}$ to change sign depending on the orientation of $\vec s$.)

Inspecting figure 3a, you see why. Look at it as an observer on the planet. In the first case, 


one of the two balls, being closer to the planet, is falling faster than the other. Thus, they 


* When a pair of repeated indices, such as j in (9), is summed over, they are often said to be contracted with 
each other (as mentioned in a footnote in the preceding chapter) in the sense that this index no longer appears 
in the result, as shown by the left hand side of (9). 
Figure 3 A falling ring of balls as seen by an observer on the planet (a), and
as seen by an observer falling with the balls (b). 


are moving away from each other. In the second case, the two balls are coming closer due 
to spherical symmetry: they are both heading toward the center of the planet. As Newton 
pointed out, objects do not fall down to earth, but toward the center of the earth. 

In your rest frame (figure 3b) as you fall along with the balls, however, you see a tidal 
force acting on the circular ring (or a spherical shell if you prefer) of balls. The force 
appears to stretch the ring in the z-direction and to squeeze it in the orthogonal direction. 
When we come to Einstein’s prediction of gravitational waves in chapter IX.4, we will see 
that gravitational waves act on the detector according to equations analogous to (9) and 
(10). Note also for future reference that the tidal force $R^{ij}(\vec x) \equiv \partial^i\partial^jV(\vec x)$ involves two
derivatives acting on the gravitational potential $V(\vec x)$.


Invariant tensors 


In $D$-dimensional space, define the antisymmetric symbol $\epsilon^{ijk\cdots n}$ carrying $D$ indices to
have the following properties:

$$\epsilon^{\cdots l \cdots m \cdots} = -\epsilon^{\cdots m \cdots l \cdots} \quad\text{and}\quad \epsilon^{12\cdots D} = 1 \tag{11}$$


In other words, the antisymmetric symbol $\epsilon$ flips sign upon the interchange of any pair
of indices. It follows that $\epsilon$ vanishes when two indices are equal. (Note that the second
property listed is just normalization.) Since each index can take on only values $1, 2, \cdots, D$,
the antisymmetric symbol for $D$-dimensional space must carry $D$ indices as already noted.


For example, for $D = 2$, $\epsilon^{12} = -\epsilon^{21} = 1$, with all other components vanishing. For $D = 3$,
$\epsilon^{123} = \epsilon^{231} = \epsilon^{312} = -\epsilon^{213} = -\epsilon^{132} = -\epsilon^{321} = 1$, with all other components vanishing (as
was already noted in the preceding chapter).
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Using the Kronecker delta and the antisymmetric symbol, we can write the defining 
properties of rotations R’ R = J and det R= 1as 


5ti RIK RA = 5k! (12) 
and 
eliken pip RJd Rkr Lee RU = oP Ss det R= ehare'’s (13) 


respectively. In (13) we used the definition of det R. (Verify this for D = 2 and 3.) 
Referring to (1), we see that we can describe 5/ and e'/‘"" as invariant tensors: they 
transform into themselves. For the rest of this text, we will often use, implicitly or explicitly, 
the notion of invariant tensors. 
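
The parenthetical invitation to verify (13) for D = 2 and 3 can also be handed to a computer. What follows is a minimal sketch, not part of the original text: the helper names `levi_civita` and `max_deviation_from_13`, the rotation angle 0.7, and the particular rotations are all illustrative assumptions. The antisymmetric symbol is built from permutation signs, and the left-hand side of (13) is compared with $\varepsilon^{pq\cdots s}$ for every choice of free indices (indices run from 0 in the code).

```python
import math
from itertools import permutations, product

def levi_civita(D):
    """Nonzero components of the antisymmetric symbol, keyed by 0-based index tuples."""
    eps = {}
    for perm in permutations(range(D)):
        # The sign of a permutation is (-1)^(number of inversions).
        inversions = sum(
            1 for a in range(D) for b in range(a + 1, D) if perm[a] > perm[b]
        )
        eps[perm] = -1 if inversions % 2 else 1
    return eps

def max_deviation_from_13(R):
    """Largest |eps^{ij...n} R^{ip} R^{jq} ... R^{ns} - eps^{pq...s}| over all free indices."""
    D = len(R)
    eps = levi_civita(D)
    worst = 0.0
    for cols in product(range(D), repeat=D):
        lhs = 0.0
        for rows, sign in eps.items():
            term = float(sign)
            for row, col in zip(rows, cols):
                term *= R[row][col]
            lhs += term
        worst = max(worst, abs(lhs - eps.get(cols, 0)))
    return worst

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

c, s = math.cos(0.7), math.sin(0.7)
R2 = [[c, -s], [s, c]]                               # rotation in the plane
Rz = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]    # rotation about the third axis
Rx = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]    # rotation about the first axis
R3 = matmul(Rz, Rx)                                  # a less special rotation in D = 3

# Both numbers should be zero up to floating-point rounding.
print(max_deviation_from_13(R2), max_deviation_from_13(R3))
```

Because det R = 1 for a rotation, any deviation beyond rounding error would signal a mistake in the index bookkeeping rather than in (13).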
For example, for SO(3), using (13) you can show that $\varepsilon^{ijk} A^i B^j = C^k$ defines a vector
$\vec C = \vec A \times \vec B$, the familiar cross product. Various identities follow. Consider, for example,

$$\varepsilon^{ijk}\varepsilon^{lnk} = \delta^{il}\delta^{jn} - \delta^{in}\delta^{jl} \tag{14}$$

To prove this, simply note that both sides transform as invariant tensors with four indices,
and the symmetry properties (such as under $i \leftrightarrow j$) of the two sides match. Contracting
with $A^j$, $B^l$, and $C^n$, we obtain an identity you might recognize:
$\vec A \times (\vec B \times \vec C) = \vec B(\vec A \cdot \vec C) - \vec C(\vec A \cdot \vec B)$.
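
For readers who like to see an identity checked component by component, here is a small sketch (an illustration, not the text's proof). It checks $\varepsilon^{ijk}\varepsilon^{lnk} = \delta^{il}\delta^{jn} - \delta^{in}\delta^{jl}$ for all 81 index choices and then the vector identity $\vec A \times (\vec B \times \vec C) = \vec B(\vec A \cdot \vec C) - \vec C(\vec A \cdot \vec B)$. The closed-form expression used for `eps3` and the sample vectors are assumptions chosen for brevity.

```python
from itertools import product

def eps3(i, j, k):
    """Antisymmetric symbol for D = 3 with indices 0, 1, 2 (closed form valid for these values)."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    return 1 if i == j else 0

# Check eps^{ijk} eps^{lnk} = delta^{il} delta^{jn} - delta^{in} delta^{jl}, summing over k.
identity_holds = all(
    sum(eps3(i, j, k) * eps3(l, n, k) for k in range(3))
    == delta(i, l) * delta(j, n) - delta(i, n) * delta(j, l)
    for i, j, l, n in product(range(3), repeat=4)
)

def cross(a, b):
    """C^k = eps^{ijk} A^i B^j."""
    return [
        sum(eps3(i, j, k) * a[i] * b[j] for i in range(3) for j in range(3))
        for k in range(3)
    ]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Arbitrary sample vectors (illustrative values only).
A, B, C = [1.0, 2.0, 3.0], [-2.0, 0.5, 4.0], [0.0, -1.0, 2.5]
lhs = cross(A, cross(B, C))
rhs = [B[i] * dot(A, C) - C[i] * dot(A, B) for i in range(3)]

print(identity_holds, lhs, rhs)
```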


Closing of Newtonian orbits once again 


We can now go back to the apparent mystery in chapter I.1, that the Newtonian orbits in 
a 1/r potential close. Out of the conserved angular momentum vector $\vec l = \vec r \times \vec p = \vec r \times \dot{\vec r}$
(we are using the notation of chapter I.1; we have effectively set the mass to unity and
hence the second equality) we can form the Laplace-Runge-Lenz vector $\vec{\mathcal L} = \vec l \times \dot{\vec r} + \kappa\,\vec r / r$.

Computing the time derivative $d\vec{\mathcal L}/dt$, you can verify (see exercise 4) that $\vec{\mathcal L}$ is conserved for
an inverse square central force. When $\dot{\vec r}$ is perpendicular to $\vec r$, which occurs at perihelion
and aphelion, the vector $\vec{\mathcal L}$ points in the direction of $\vec r$. We could take the constant vector
$\vec{\mathcal L}$ to point toward the perihelion, and thus the position of the perihelion does not change.
Hence the orbit closes. 
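
A numerical check can make the conservation law tangible. The sketch below rests on stated assumptions: the force per unit mass is taken to be $-\kappa\,\vec r/r^3$, the Laplace-Runge-Lenz vector is taken in the form $\vec l \times \dot{\vec r} + \kappa\,\vec r/r$ with $\vec l = \vec r \times \dot{\vec r}$, and the value of $\kappa$, the initial conditions, and the Runge-Kutta integrator are illustrative choices rather than anything prescribed by the text.

```python
import math

KAPPA = 1.0  # assumed strength of the inverse-square force (illustrative value)

def derivative(state):
    """Time derivative of (x, y, vx, vy) for an attractive inverse-square force in the plane."""
    x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -KAPPA * x / r3, -KAPPA * y / r3)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivative(state)
    k2 = derivative(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = derivative(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = derivative(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt * (a + 2 * b + 2 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def lrl_vector(state):
    """In-plane components of l x v + kappa r / r, with l = r x v along the z-axis."""
    x, y, vx, vy = state
    l = x * vy - y * vx
    r = math.hypot(x, y)
    # (0, 0, l) x (vx, vy, 0) = (-l vy, l vx, 0)
    return (-l * vy + KAPPA * x / r, l * vx + KAPPA * y / r)

state = (1.0, 0.0, 0.0, 1.2)   # a bound, noncircular orbit (illustrative initial data)
initial = lrl_vector(state)
max_drift = 0.0
for _ in range(20000):          # total time 20, somewhat more than one orbital period
    state = rk4_step(state, 1e-3)
    current = lrl_vector(state)
    max_drift = max(
        max_drift, math.hypot(current[0] - initial[0], current[1] - initial[1])
    )

print(initial, max_drift)
```

The drift stays at the level of the integration error while position and velocity change by order one, which is what conservation means in practice. Repeating the run with a force law other than inverse square should make the drift large.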

This result does not hold in Einstein gravity. The precession of the perihelion of Mercury, 
which we will discuss in chapter VI.3, is of course one of the classic tests of general 
relativity. 


Appendix: Two lemmas for future use 


There is a lot more we could say about tensors, but let me mention two simple lemmas that we will happen to 
need later. 

Let $S^{ij}$ and $A^{ij}$ be two arbitrary and unrelated tensors, symmetric and antisymmetric, respectively. Then
$S^{ij}A^{ij} = 0$. (See exercise 5.)

Tensors can have all kinds of symmetry properties, which you can explore on your own and in the exercises. For
example, a totally antisymmetric 3-indexed tensor $T^{ijk}$ is such that T flips sign under the interchange of any pair
of indices (for example, $T^{ijk} = -T^{jik} = +T^{jki}$). A multi-indexed tensor can also have symmetry properties under
the interchange of a specific pair, or may have no symmetry at all. Consider, for example, a tensor $G^{ikj}$ symmetric
under the interchange of the first pair of indices only, that is, $G^{ikj} = G^{kij}$. To be pedantic and absolutely clear,
sometimes I like to put a space or a dot between the indices, thus $G^{ik\ j}$ or $G^{ik\cdot j}$, to separate the “special” pair
from the other indices. For example, our tensor could happen to be $G^{ik\cdot j} = \partial^i \partial^k W^j(\vec x)$ for some vector field $W^j$.

Given $G^{ik\cdot j}$, define $H^{i\cdot kj} = G^{ik\cdot j} + G^{ij\cdot k}$. (Note that $H^{i\cdot kj} = H^{i\cdot jk}$ by definition, but $H^{i\cdot kj}$ is in general not
equal to $H^{k\cdot ij}$.) Then we can solve for G in terms of H:

$$G^{ik\cdot j} = \frac{1}{2}\left(H^{i\cdot kj} + H^{k\cdot ij} - H^{j\cdot ik}\right) \tag{15}$$

(See exercise 8.)
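
Both lemmas are easy to test numerically before proving them. The sketch below is an illustration under stated assumptions: it builds random tensors with the required symmetries in D = 3, checks that the contraction of a symmetric with an antisymmetric 2-indexed tensor vanishes, and checks that a 3-indexed tensor G symmetric in its first pair is recovered as half of $H^{i\cdot kj} + H^{k\cdot ij} - H^{j\cdot ik}$, where $H^{i\cdot kj} = G^{ik\cdot j} + G^{ij\cdot k}$. The dictionary layout and the random seed are arbitrary choices.

```python
import random
from itertools import product

D = 3
random.seed(0)  # fixed seed so the run is reproducible
indices2 = list(product(range(D), repeat=2))
indices3 = list(product(range(D), repeat=3))

# First lemma: symmetric S and antisymmetric A built from unrelated random matrices.
M = {idx: random.uniform(-1, 1) for idx in indices2}
N = {idx: random.uniform(-1, 1) for idx in indices2}
S = {(i, j): M[(i, j)] + M[(j, i)] for i, j in indices2}
A = {(i, j): N[(i, j)] - N[(j, i)] for i, j in indices2}
contraction = sum(S[idx] * A[idx] for idx in indices2)

# Second lemma: G[(i, k, j)] stores G^{ik.j}, symmetric under i <-> k only.
raw = {idx: random.uniform(-1, 1) for idx in indices3}
G = {(i, k, j): raw[(i, k, j)] + raw[(k, i, j)] for i, k, j in indices3}

# H[(i, k, j)] stores H^{i.kj} = G^{ik.j} + G^{ij.k}.
H = {(i, k, j): G[(i, k, j)] + G[(i, j, k)] for i, k, j in indices3}

# Solve for G: G^{ik.j} = (H^{i.kj} + H^{k.ij} - H^{j.ik}) / 2.
max_error = max(
    abs(G[(i, k, j)] - 0.5 * (H[(i, k, j)] + H[(k, i, j)] - H[(j, i, k)]))
    for i, k, j in indices3
)

print(contraction, max_error)
```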

Exercises

1. Define $\vec\nabla = \left(\frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \cdots, \frac{\partial}{\partial x^D}\right)$. Show that if $\phi$ is a scalar, then
$(\vec\nabla\phi)^2 = \vec\nabla\phi\cdot\vec\nabla\phi = \sum_k \left(\frac{\partial\phi}{\partial x^k}\right)^2$ and $\nabla^2\phi$
transform like a scalar. The Laplacian is defined by

$$\nabla^2 = \frac{\partial^2}{\partial(x^1)^2} + \frac{\partial^2}{\partial(x^2)^2} + \cdots + \frac{\partial^2}{\partial(x^D)^2}$$

2. Show that the symmetric tensor $S^{ij}$ is indeed a tensor.

3. Show that the infinitesimal volume element $d^D x$ is a scalar.

4. Show that the Laplace-Runge-Lenz vector is conserved.

5. Show that $S^{ij}A^{ij} = 0$ if $S^{ij}$ is a symmetric tensor and $A^{ij}$ an antisymmetric tensor.

6. Let $T^{ijk}$ be a totally antisymmetric 3-indexed tensor. Show that T has $\frac{1}{3!}D(D - 1)(D - 2)$ components.
Identify the one component for D = 3.

7. Consider for SO(3) the tensor $T^{ijk}$ from exercise 6. Show that it transforms as a scalar.

8. Prove the lemma in (15).

9. Verify (13) for D = 2 and 3.


Note 


1. For a concise introduction to some of the group theory needed in theoretical physics, see QFT Nut, appendix B.


From Change of Coordinates to Curved Spaces 


Euclidean spaces described with different coordinates 


In discussing rotations in chapter I.3, I emphasized that Euclid is defined by Pythagoras.
That the square of the distance between two neighboring points in 2-dimensional
Euclidean space with coordinates (x, y) and (x + dx, y + dy) is given by $ds^2 = dx^2 + dy^2$
defines what we mean by Euclidean space.

But even the familiar Euclidean space can look unfamiliar. You know well that in many 
physics problems, one set of coordinates is often much more convenient than another. 
Indeed, in discussing Newton’s planetary orbit problem in chapter I.1, we changed from 
Cartesian* coordinates (x, y) to polar coordinates $(r, \theta)$, with $x = r\cos\theta$ and $y = r\sin\theta$.
Differentiating, we have $dx = dr\cos\theta - r\sin\theta\,d\theta$ and $dy = dr\sin\theta + r\cos\theta\,d\theta$, so that

$$ds^2 = dx^2 + dy^2 = (dr\cos\theta - r\sin\theta\,d\theta)^2 + (dr\sin\theta + r\cos\theta\,d\theta)^2 = dr^2 + r^2 d\theta^2 \tag{1}$$


We are free to make any coordinate transformation we feel like. Consider the most general
transformation $x = f(u, v)$, $y = g(u, v)$. Then $dx = f_u(u, v)du + f_v(u, v)dv$, where
$f_u = \frac{\partial f}{\partial u}$ and so on, and $dy = g_u(u, v)du + g_v(u, v)dv$. Just plug in to obtain
$ds^2 = dx^2 + dy^2 = (f_u^2 + g_u^2)du^2 + (f_v^2 + g_v^2)dv^2 + 2(f_u f_v + g_u g_v)du\,dv$. With a gunky choice of f and
g you will end up with a mess of a coordinate system that would only make your life
miserable. (Note that even the innocuous change $x = u + v$ and $y = v$ leads to
$ds^2 = du^2 + 2dv^2 + 2du\,dv$ with the rather unpleasant $du\,dv$ cross term.) Of course, it was discovered
long ago that by choosing $f(u, v) = u\cos v$ and $g(u, v) = u\sin v$, we can get rid of the
cross term. By now probably all the nice choices for f and g have already been published
by someone.
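
The "just plug in" step can be automated for any choice of f and g. Here is a minimal sketch, assuming central finite differences are an acceptable stand-in for the partial derivatives; the function name `metric_from_map`, the step size, and the sample points are illustrative. It returns the three coefficients $(f_u^2 + g_u^2,\ f_v^2 + g_v^2,\ f_u f_v + g_u g_v)$, so the cross term in $ds^2$ is twice the last entry.

```python
import math

def metric_from_map(f, g, u, v, h=1e-6):
    """Coefficients (g_uu, g_vv, g_uv) of ds^2 = dx^2 + dy^2 for x = f(u, v), y = g(u, v)."""
    # Central differences approximate the partial derivatives f_u, f_v, g_u, g_v.
    f_u = (f(u + h, v) - f(u - h, v)) / (2 * h)
    f_v = (f(u, v + h) - f(u, v - h)) / (2 * h)
    g_u = (g(u + h, v) - g(u - h, v)) / (2 * h)
    g_v = (g(u, v + h) - g(u, v - h)) / (2 * h)
    return (f_u**2 + g_u**2, f_v**2 + g_v**2, f_u * f_v + g_u * g_v)

# The "innocuous" change x = u + v, y = v: expect ds^2 = du^2 + 2 dv^2 + 2 du dv.
shear = metric_from_map(lambda u, v: u + v, lambda u, v: v, 0.3, -1.1)

# The polar choice x = u cos v, y = u sin v: expect no cross term and g_vv = u^2.
polar = metric_from_map(
    lambda u, v: u * math.cos(v), lambda u, v: u * math.sin(v), 2.0, 0.4
)

print(shear)
print(polar)
```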


* When I was in high school, I got the erroneous impression that the notion of coordinates originated with 
Descartes. In fact, by the time of Ptolemy, astronomers in the West certainly had latitudes and longitudes. In 
China, Chang Heng, roughly a contemporary of Ptolemy, was said to have derived, by watching a woman weaving, 
a system of coordinates to map heaven and earth with. The Chinese words for latitudes and longitudes, “jing” 
and “wei,” are just the terms for warp and weft in weaving.

I presume that you also know how to go from Cartesian coordinates (x, y, z) in 3-dimensional
Euclidean space $E^3$ to spherical coordinates $(r, \theta, \varphi)$, with $x = r\sin\theta\cos\varphi$,
$y = r\sin\theta\sin\varphi$, $z = r\cos\theta$. The more-than-familiar (and who can blame you if you have
been in it all your life?) $E^3$ could be described by either $ds^2 = dx^2 + dy^2 + dz^2$ in Cartesian
coordinates or by

$$ds^2 = dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2 \tag{2}$$

in spherical coordinates.


From Latin to Greek 


We can systematize and generalize this to D-dimensional space easily enough. In previous
chapters, I used Latin letters for the index on the coordinates. I now switch, for later
convenience, from Latin to Greek and call the coordinates $x^\mu = (x^1, x^2, \cdots, x^D)$. Then,
for Euclid’s spaces $E^D$, Pythagoras said that $ds^2 = \sum_{\mu=1}^{D} (dx^\mu)^2$.

We write this in the fancier form $ds^2 = \sum_{\mu=1}^{D}\sum_{\nu=1}^{D} g_{\mu\nu}\,dx^\mu dx^\nu$ by introducing a D-by-D
matrix g whose diagonal elements are all equal to one and whose other elements
are all zero, the famous matrix known far and wide as the identity matrix. To repeat, the
indices $\mu$, $\nu$ run over 1, 2, ..., D, and $g_{\mu\nu}$ is defined by $g_{\mu\mu} = 1$ and $g_{\mu\nu} = 0$ if $\mu \neq \nu$. (In
other words, it is just the Kronecker delta introduced in chapter I.3: $g_{\mu\nu} = \delta_{\mu\nu}$.) Thus, in the
double sum for $ds^2$, the terms with $\mu \neq \nu$ drop out and we are left with $ds^2 = \sum_{\mu=1}^{D} (dx^\mu)^2$.

Now a word on notation. In the chapter on rotation, I have already introduced this
expression for $ds^2$, and furthermore, the repeated index summation convention. Einstein
suggested that between us friends we could omit the cumbersome summation symbol
and agree that if an index is repeated, then it is to be summed over. Thus, we suppress the
double summation $\sum_{\mu=1}^{D}\sum_{\nu=1}^{D}$ and write simply $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$. Here $\mu$ and $\nu$ are both
repeated and hence summed over. Unless there is a risk of confusion, no more summation
symbols!


The metric 


The matrix $g_{\mu\nu}$ is called the metric, a word meaning measure, as in geometry, the science
of measuring the earth. We use the metric to measure space. This step of introducing a 
metric for Euclidean spaces seems like one of those totally senseless moves that certain 
academics like and publish. In the discussion just given, the metric is simply the identity 
matrix. 

But as soon as we change coordinates, the metric is no longer so simple. As we have
already noted in (1), with polar coordinates, the plane $E^2$ is described by a metric with
$g_{rr} = 1$, $g_{\theta\theta} = r^2$, and $g_{r\theta} = 0 = g_{\theta r}$. With spherical coordinates, $E^3$ is described by a metric
with $g_{rr} = 1$, $g_{\theta\theta} = r^2$, $g_{\varphi\varphi} = r^2\sin^2\theta$, with all other entries zero, as in (2).

In both examples, the metric is not given by the identity matrix. Furthermore, the
metric $g_{\mu\nu}(x)$ varies from point to point. For example, $g_{\varphi\varphi}$ depends on both r and $\theta$.
Note, however, that for these examples, the metric is diagonal. (That is why polar and
spherical coordinates are so popular!) In general, the metric $g_{\mu\nu}$ need not be diagonal
(as shown in the example $ds^2 = du^2 + 2dv^2 + 2du\,dv$, for which $g_{uu} = 1$, $g_{vv} = 2$, $g_{uv} = g_{vu} = 1$).
However, in this text, for the sake of simplicity, we will mostly stick to metrics
that are diagonal. Furthermore, since $dx^\mu dx^\nu = dx^\nu dx^\mu$, the metric is symmetric under
interchange of indices: $g_{\mu\nu} = g_{\nu\mu}$. It goes without saying that the reader encountering all
this for the first time should verify everything I say.
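
In that spirit of verifying everything, the double sum hidden in $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$ can be spelled out as two explicit loops. The following sketch is illustrative only (the function name and the sample displacements are assumptions). It evaluates the same small displacement with the polar metric and with the Cartesian identity metric, and then evaluates the nondiagonal example with $g_{uu} = 1$, $g_{vv} = 2$, $g_{uv} = g_{vu} = 1$.

```python
import math

def line_element_squared(g, dx):
    """ds^2 = g_{mu nu} dx^mu dx^nu, with both repeated indices summed explicitly."""
    D = len(dx)
    return sum(g[mu][nu] * dx[mu] * dx[nu] for mu in range(D) for nu in range(D))

# One small displacement at the point (r, theta), described in two coordinate systems.
r, theta = 2.0, 0.5
dr, dtheta = 1e-3, 2e-3
dx = dr * math.cos(theta) - r * math.sin(theta) * dtheta
dy = dr * math.sin(theta) + r * math.cos(theta) * dtheta

g_polar = [[1.0, 0.0], [0.0, r**2]]       # g_rr = 1, g_theta_theta = r^2
g_cartesian = [[1.0, 0.0], [0.0, 1.0]]    # the identity matrix

ds2_polar = line_element_squared(g_polar, [dr, dtheta])
ds2_cartesian = line_element_squared(g_cartesian, [dx, dy])

# The nondiagonal example: the two off-diagonal entries together give the 2 du dv term.
g_uv = [[1.0, 1.0], [1.0, 2.0]]
du, dv = 0.1, -0.2
ds2_uv = line_element_squared(g_uv, [du, dv])   # du^2 + 2 dv^2 + 2 du dv

print(ds2_polar, ds2_cartesian, ds2_uv)
```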


Lower indices appear 


The attentive reader might have noticed that lower indices have sneakily appeared! The 
metric $g_{\mu\nu}$ carries lower indices, while $dx^\mu$ carries an upper index. When I taught Einstein
gravity, the appearance of upper and lower indices invariably confused some students. In 
this text, I will try to motivate the point of introducing upper and lower indices, more 
from a utilitarian, rather than a profoundly mathematical, point of view. My strategy is to 
introduce this business of two kinds of indices in stages. 

At this stage, the motivation, to put it bluntly, is that we just feel like it. But this 
caprice immediately leads to a useful rule. In the Einstein repeated index summation, 
we will insist that when we sum over a pair of repeated indices, one of them must be 
upstairs, the other downstairs. This is manifestly, and trivially, satisfied by the only example 
$ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu$ we have encountered thus far. The whole business of two kinds of
indices may seem unnecessary at this point, but later, you will see that the distinction 
between upper and lower indices becomes essential, or at least highly useful. 

A word about terminology: Some authors refer to $ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu$ as the square
of the line element, reserving the term metric for the object $g_{\mu\nu}(x)$ contained in the
line element. I find it convenient to abuse terminology and simply refer to both as the 
metric. 

Let me mention one trivial point, but one with the potential for confusing beginners. 
Some years ago, when I surveyed the students in my class for points of confusion, one
student told me that for quite a while he did not realize that $g_{\mu\nu}(x)dx^\mu dx^\nu$, $g_{\rho\sigma}(x)dx^\rho dx^\sigma$,
and so on, all denote the same thing! Perhaps this is because the summation symbol
has been suppressed: the same student could recognize that

$$\sum_\mu\sum_\nu g_{\mu\nu}(x)dx^\mu dx^\nu = \sum_\rho\sum_\lambda g_{\rho\lambda}(x)dx^\rho dx^\lambda = \sum_\zeta\sum_\nu g_{\zeta\nu}(x)dx^\zeta dx^\nu.$$


Change of coordinates, curved space, and curved spacetime 


We all know that in Euclidean 3-space, if we restrict r to be equal to a, we would find 
ourselves on the surface of a sphere of radius a. In other words, the set of points at a 
distance a from the origin form a sphere with radius a.

This procedure gives us an easy way to determine the metric on a sphere. Simply
take the metric (2) $ds^2 = dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2$ and set r = a. Then dr = 0, so that
$ds^2$ collapses to $a^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$. From 3-dimensional flat space we have “lost” the
coordinate r and gone to a 2-dimensional curved space with coordinates $x^\mu = (x^1, x^2) = (\theta, \varphi)$.
Without loss of generality, we can take a as the unit of distance and set a = 1. So,
on the unit 2-dimensional sphere $S^2$

$$ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2 \tag{3}$$

with a metric given by $g_{11} = g_{\theta\theta} = 1$, $g_{22} = g_{\varphi\varphi} = \sin^2\theta$, and $g_{12} = g_{\theta\varphi} = g_{21} = g_{\varphi\theta} = 0$.

The take-home message here is that curved space is just a skip and a hop away from the 
familiar change of coordinates. This is fortunate for students of physics: when you learned 
to change coordinates, you were actually also learning about curved spaces. We are now 
going to develop a general formalism for changing coordinates. Even though you already 
know how to change coordinates, it pays to learn this formalism, because we can also use 
it to study curved space and curved spacetime (which, as you have surely heard, plays a 
central role in Einstein gravity). 

Change of coordinates, curved space, and curved spacetime: basically the same deal, as 
you will see. 


How do we know whether a space is curved or not? 


This raises an exceedingly interesting and crucial question: given a space with the metric 
$g_{\mu\nu}(x)$, how do we know whether it is curved or flat?

A complicated looking metric does not necessarily mean that the space is curved, since 
somebody could have simply chosen an especially gunky coordinate system. It could be flat 
space in disguise. To forcefully bring home this point, I invite you to consider
$ds^2 = (1 + u^2)du^2 + (1 + 4v^2)dv^2 + 2(2v - u)du\,dv$ and
$ds^2 = (1 + u^2)du^2 + (1 + 2v^2)dv^2 + 2(2v - u)du\,dv$.
One describes flat space, the other a space that at some points is violently curved.
Which is which? 

Puzzled, you reply: “How could I possibly tell?” 

That’s in fact the correct answer at this stage of this discussion. The two metrics I just 
gave you look almost identical except for one single $2 \leftrightarrow 4$. In one of the most famous
episodes in mathematics, Carl Friedrich Gauss (1777-1855) solved this problem for 2- 
dimensional spaces. His work was then generalized by his student Bernhard Riemann 
(1826-1866). Later, in chapter VI.1, given any metric in any number of dimensions, you 
will be able to calculate, and even better, to train the computer to calculate, something 
called the Riemann curvature tensor, which will tell you once and for all if the space is flat 
or curved. No more thinking involved! Gauss and Riemann did it for you. 

But for now, let me ask you to think about two simple examples in good old 2-dimensional
space, for which our intuition is allegedly pretty good. We know that
$ds^2 = dr^2 + r^2 d\theta^2$ describes flat space. Consider

$$ds^2 = d\rho^2 + \sin^2\rho\,d\theta^2 \tag{4}$$

Is the space being described flat or curved? Or consider the space described by

$$ds^2 = \cos^2\rho\,d\rho^2 + \sin^2\rho\,d\theta^2 \tag{5}$$


Is it flat or curved? You should think about this before reading on. The answers are given 
in appendix 1. 

Remember the civilization of mites in the prologue? You are in the same position as 
the mite professors of geometry: they can measure the distance between infinitesimally 
separated points, and from that they have to figure out whether their world is curved. We 
will face the same problem as the mites when we get to cosmology in parts V and VI. 


The logic of differential geometry 


Differential geometry, as developed by Gauss and Riemann, tells us that given the metric, 
we can calculate the curvature. The logic goes as follows. The metric tells you the distance 
between two nearby points. Integrating, you can obtain the distance along any curve joining 
two points, not necessarily nearby. Find the curve with the shortest distance. By definition, 
this curve is the “straight line” between these two points. Once you know how to find 
the “straight line” between any two points, you can test all of Euclid’s theorems to see 
whether our space is flat. For example, as described in the prologue, the mite geometers 
could now draw a small circle around any point, measure its circumference, and see if 
it is equal to $2\pi$ times the radius. (See appendix 1.) Thus, the metric can tell us about
curvature. 
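
The circle test can be sketched numerically for metrics of the rotationally symmetric, diagonal form $ds^2 = A(\rho)^2 d\rho^2 + B(\rho)^2 d\alpha^2$, where $\rho$ is a radial and $\alpha$ an angular coordinate (this notation, the function names, and the radius 0.5 are assumptions made for illustration). By symmetry the curves of constant $\alpha$ serve as the "straight lines" through the center, so the radius is the integral of $A\,d\rho$ and the circumference is the integral of $B\,d\alpha$ around the circle. The two inputs below are the flat plane in polar coordinates, as in (1), and the unit sphere with the circle drawn around the north pole, as in (3).

```python
import math

def integrate(func, a, b, n=10000):
    """Midpoint-rule approximation to the integral of func from a to b."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def circle_ratio(sqrt_g_radial, sqrt_g_angular, rho0):
    """Circumference divided by 2 pi times radius, for the circle at radial coordinate rho0."""
    radius = integrate(sqrt_g_radial, 0.0, rho0)
    circumference = integrate(lambda angle: sqrt_g_angular(rho0), 0.0, 2 * math.pi)
    return circumference / (2 * math.pi * radius)

# Flat plane in polar coordinates: ds^2 = dr^2 + r^2 dtheta^2.
flat_ratio = circle_ratio(lambda rho: 1.0, lambda rho: rho, 0.5)

# Unit sphere: ds^2 = dtheta^2 + sin^2(theta) dphi^2, circle of constant theta.
sphere_ratio = circle_ratio(lambda rho: 1.0, lambda rho: math.sin(rho), 0.5)

print(flat_ratio, sphere_ratio)
```

A ratio of exactly 1 at every radius is what Euclid demands; on the sphere the ratio comes out as sin(0.5)/0.5, which is less than 1, so geometers confined to the surface could detect the curvature from this one measurement.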

Take an everyday example: given an airline table of distances, you can deduce that the 
world is curved without ever going outside. If I tell you the three distances between Paris, 
Berlin, and Barcelona, you can draw a triangle on a flat piece of paper with the three cities at 
the vertices. But now if I also give you the distances between Rome and each of these three 
cities, you would find that you can’t extend the triangle to a planar quadrangle (figure 1). So 
the distances between four points suffice to prove that the world is not flat. But the metric 
tells you the distances between an infinite number of points. 
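
The four-city argument can be turned into a computation. One way, sketched here under stated assumptions, uses the Gram matrix: pick one point as origin, build $G_{ij} = \frac{1}{2}(d_{0i}^2 + d_{0j}^2 - d_{ij}^2)$ from the distance table, and note that if the four points fit in a flat plane the three displacement vectors span at most two dimensions, so $\det G$ must vanish. The points below are hypothetical (they are not real city coordinates), and the function names are illustrative.

```python
import math

def gram_determinant(d):
    """Determinant of the 3x3 Gram matrix built from a 4x4 distance table, point 0 as origin."""
    def gram(i, j):
        return 0.5 * (d[0][i] ** 2 + d[0][j] ** 2 - d[i][j] ** 2)
    G = [[gram(i, j) for j in (1, 2, 3)] for i in (1, 2, 3)]
    return (
        G[0][0] * (G[1][1] * G[2][2] - G[1][2] * G[2][1])
        - G[0][1] * (G[1][0] * G[2][2] - G[1][2] * G[2][0])
        + G[0][2] * (G[1][0] * G[2][1] - G[1][1] * G[2][0])
    )

def distance_table(points, dist):
    return [[dist(p, q) for q in points] for p in points]

def planar_distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def great_circle_distance(p, q):
    """Distance on the unit sphere between points given as (polar angle, azimuthal angle)."""
    cos_angle = (
        math.cos(p[0]) * math.cos(q[0])
        + math.sin(p[0]) * math.sin(q[0]) * math.cos(p[1] - q[1])
    )
    return math.acos(max(-1.0, min(1.0, cos_angle)))

# Four hypothetical points in a flat plane.
flat_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
# Four hypothetical points on the unit sphere: three on the equator and the north pole.
half_pi = math.pi / 2
sphere_points = [(half_pi, 0.0), (half_pi, half_pi), (0.0, 0.0), (half_pi, math.pi)]

flat_det = gram_determinant(distance_table(flat_points, planar_distance))
sphere_det = gram_determinant(distance_table(sphere_points, great_circle_distance))

print(flat_det, sphere_det)
```

The flat table gives a determinant of zero up to rounding, while the spherical table gives a value far from zero, so no planar quadrangle can reproduce those six distances.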


Figure 1 The distances between four cities suffice to prove that the world is not flat.

Figure 2 The Poincaré half plane (including pictures of Poincaré).


Poincaré half plane 


Let me tell you the interesting example of the Poincaré half plane.* Consider the upper half
plane covered by the usual coordinates (x, y) with y > 0, but endowed with the peculiar
metric

$$ds^2 = \frac{dx^2 + dy^2}{y^2} \tag{6}$$

See figure 2. In other words, $g_{xx} = g_{yy} = \frac{1}{y^2}$. The space is translation invariant in x, that is,
$g_{\mu\nu}(x + a, y) = g_{\mu\nu}(x, y)$ for any a, and the space at one value of x looks exactly the same
as the space at some other value of x, but it is not translation invariant in y. Evidently,
this space has an edge at y = 0. Consider a standard ruler with length $l$, pointing along
the x-axis, so that the two ends of the ruler are separated by $\Delta y = 0$. According to (6),
$l = \Delta s = \Delta x / y$. Take the ruler closer and closer to the edge. The ruler covers less and less
$\Delta x$ as you approach the edge: indeed $\Delta x = yl \to 0$ as $y \to 0$. It would appear that your
ruler is shrinking relative to the milestones the inhabitants of this world have helpfully
erected at fixed values of x. (We point the ruler along the x-axis merely for pedagogical
clarity; in fact, we would reach the same conclusion regardless of the ruler’s orientation.
For example, a ruler pointing along the y-axis would cover $\Delta y = yl$, which $\to 0$ as you
approach the edge.)

In reality, the edge is infinitely far away, since the actual distance between the points
$(x, y_*)$ and $(x, 0^+)$ along the line of constant x is given by

$$\int ds = \int_{0^+}^{y_*} \frac{dy}{y} = \log\left(\frac{y_*}{0^+}\right) \to \infty$$
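
To watch the distance to the edge grow without bound, one can integrate $ds = dy/y$ along a line of constant x down to a small cutoff standing in for $0^+$. This is a minimal sketch; the cutoff values, the step count, and the function name are illustrative assumptions. The midpoint-rule result is compared with $\log(y_*/\text{cutoff})$.

```python
import math

def distance_along_constant_x(y_low, y_high, steps=200000):
    """Midpoint-rule estimate of the integral of ds = dy / y from y_low to y_high."""
    h = (y_high - y_low) / steps
    return sum(h / (y_low + (k + 0.5) * h) for k in range(steps))

y_star = 1.0
cutoffs = [1e-1, 1e-2, 1e-3]   # stand-ins for 0^+ (illustrative values)
numeric = [distance_along_constant_x(cutoff, y_star) for cutoff in cutoffs]
exact = [math.log(y_star / cutoff) for cutoff in cutoffs]

for cutoff, n, e in zip(cutoffs, numeric, exact):
    print(cutoff, n, e)
```

Each factor of 10 closer to the edge in the coordinate y adds the same amount, log 10, to the distance, so the edge is never reached in a finite distance.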


* First discovered by the Italian mathematician Eugenio Beltrami (1835-1899) long before Poincaré.

Imagine, as in a sci-fi movie, finding yourself in such a weird space. Before reading on,
can you figure out the straight line joining two points, say $(0, y_*)$ and $(x_*, y_*)$? The phrase
“straight line joining two points” is of course, as already mentioned, defined to be the path
of shortest distance between the two points.

If you were to go along the line of fixed $y = y_*$, the distance would be
$\int ds = \left(\int_0^{x_*} dx\right)/y_* = x_*/y_*$. Clearly, you could do better. To economize on ds, you should curve away from
the edge at y = 0 into the region of larger y and hence smaller ds for a given stretch of
(dx, dy). At this point, we can only discuss this curved path qualitatively. In chapter II.2,
we will learn how to determine this curve. In this sci-fi movie, you see your favorite person
in the distance. You run to him or her, but you sure don’t want this person to think you
are an idiot, which you would be if you ran along a “straight” line expressed in (x, y)
coordinates. You try to be like Feynman the lifeguard in the prologue, so you follow the
curve that minimizes $\int ds$.
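
The qualitative claim that a detour into larger y beats the line of fixed y can be checked by measuring trial paths with the metric $ds^2 = (dx^2 + dy^2)/y^2$. The sketch below is only an illustration: the endpoints $(0, 1)$ and $(2, 1)$ and the trial detour (an arc of a circle centered on the edge) are assumptions chosen for convenience, and nothing here derives which curve is actually shortest.

```python
import math

def path_length(path, n=20000):
    """Length of path(t), 0 <= t <= 1, in the metric ds^2 = (dx^2 + dy^2) / y^2.

    The path is chopped into short chords; each chord is measured with y taken at its midpoint.
    """
    total = 0.0
    x_prev, y_prev = path(0.0)
    for k in range(1, n + 1):
        x, y = path(k / n)
        total += math.hypot(x - x_prev, y - y_prev) / (0.5 * (y + y_prev))
        x_prev, y_prev = x, y
    return total

x_star, y_star = 2.0, 1.0   # illustrative endpoints (0, y_star) and (x_star, y_star)

def horizontal(t):
    """The line of fixed y = y_star."""
    return (t * x_star, y_star)

# Trial detour: arc of the circle centered at (x_star / 2, 0) passing through both endpoints.
radius = math.hypot(x_star / 2, y_star)
end_angle = math.atan2(y_star, x_star / 2)

def arc(t):
    angle = (math.pi - end_angle) + t * (2 * end_angle - math.pi)
    return (x_star / 2 + radius * math.cos(angle), radius * math.sin(angle))

horizontal_length = path_length(horizontal)
arc_length = path_length(arc)

print(horizontal_length, arc_length)
```

The horizontal path has length $x_*/y_* = 2$, while the arc that bulges away from the edge comes out shorter, even though it looks longer when drawn in the (x, y) plane.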

In fact, this simple example of the Poincaré half plane is behind a recent advance in 
quantum gravity and string theory, known as AdS/CFT (anti de Sitter/conformal field 
theories). See chapter IX.11.


A pervasive theme of theoretical physics 


The central message here is that coordinates do not have intrinsic geometric significance.
If you use the coordinates $x^\mu$, somebody else could perfectly well use coordinates $x'^\mu$,
with the two sets of coordinates related by the D functions of D variables $x'^\mu(x)$ and their
inverses $x^\mu(x')$.

Again we come to the pervasive theme of theoretical physics already alluded to in chapter 
I.3: transformations. Given a set of physical laws, we ask: What are the transformations 
that leave these laws unchanged? How do the relevant physical quantities transform? What 
combinations of these quantities are left invariant? Here we are concerned with coordinate 
transformations. 

Rotations furnish the prototypical example. Indeed, we see that we can think of rotation
$x'^\mu = R^\mu{}_\nu x^\nu$ as a special example of a class of coordinate transformations, namely those
that are linear. Notice that, obeying our rule of only summing an upper with a lower index, 
we moved the column index on the rotation matrix R downstairs. We also graduated from 
Latin to Greek. 


General coordinate transformation 


In elementary discussions of change of coordinates, brute force substitution, as in (1), 
suffices, but as you will see, it pays to formalize the steps involved, so that the formalism 
applies to more general situations. I have already advertised that many concepts involved 
in changing coordinates are also relevant for discussing curved spaces.

We will use as our canonical example the transformation between Cartesian and polar coordinates,
between $(x^1 = x, x^2 = y)$ and $(x'^1 = r, x'^2 = \theta)$. We have the two functions
$x^1 = x = r\cos\theta$, $x^2 = y = r\sin\theta$ and their inverses $x'^1 = r = \sqrt{x^2 + y^2}$, $x'^2 = \theta = \arctan\frac{y}{x}$.
Note that in many cases, $x^\mu$ and $x'^\mu$ all have common colloquial names, namely x, y and
r, $\theta$, respectively, in this example. We will pass freely without any further remark between
the “academic” names $x^\mu$ and $x'^\mu$ and their street names.

A word about notation: I trust the reader not to be confused by unavoidable but trivial 
notational abuse. For example, here the letter x does double duty: generically, it represents
$x^\mu$ and, in the special case of the standard Cartesian coordinates, also the first component
of $x^\mu$.

In general, coordinate transformations are definitely nonlinear. For example, the polar
angle $x'^2 = \theta = \arctan\frac{y}{x}$ is defiantly not a linear function of x and y. However, and this is
a crucial point, the infinitesimals $dx^\mu$ do transform linearly. Indeed, applying elementary
calculus, we have

$$dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}\,dx^\nu \tag{7}$$

For the sake of the reader seeing this for the first time, let’s go slow here. Calculate $dx'^3$,
for example (supposing that $D \geq 3$):

$$dx'^3 = \frac{\partial x'^3}{\partial x^1}dx^1 + \frac{\partial x'^3}{\partial x^2}dx^2 + \cdots + \frac{\partial x'^3}{\partial x^D}dx^D = \sum_{\nu=1}^{D}\frac{\partial x'^3}{\partial x^\nu}dx^\nu = \frac{\partial x'^3}{\partial x^\nu}dx^\nu \tag{8}$$

where in the last step we invoke the repeated index summation convention and drop the
summation sign. This is of course just (7) with $\mu = 3$.
It is useful to define the matrix

$$S^\mu{}_\nu(x) = \frac{\partial x'^\mu}{\partial x^\nu} \tag{9}$$

which we could regard as a matrix with a row index $\mu$ and column index $\nu$. Then we can
write $dx'^\mu = S^\mu{}_\nu(x)dx^\nu$. In accordance with the repeated index summation convention, the
index $\nu$ is summed over. Note that we put the second index on S downstairs to obey the
rule of only summing an upper with a lower index.

The infinitesimals $dx'^\mu$ and $dx^\nu$ are related linearly by the matrix $S^\mu{}_\nu(x)$, just as
$dx'^\mu = R^\mu{}_\nu dx^\nu$ for rotation. The big difference from simple rotation is of course that $S^\mu{}_\nu(x)$ depends
on x. (This fact leads to all the mathematical complications in Einstein gravity that you may 
or may not have heard about.) 

In the polar example, we have

$$dr = \frac{x\,dx + y\,dy}{\sqrt{x^2 + y^2}}, \qquad d\theta = \frac{x\,dy - y\,dx}{x^2 + y^2}$$

so that

$$S^1{}_1 = \frac{x}{r}, \quad S^1{}_2 = \frac{y}{r}, \quad S^2{}_1 = -\frac{y}{r^2}, \quad\text{and}\quad S^2{}_2 = \frac{x}{r^2}$$

(Note that, if we think of x, y, and r as having dimensions of length, the different
components of the matrix S do not even have the same dimension, which is hardly surprising,
since $x'^1 = r$ and $x'^2 = \theta$ have different length dimensions.)
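
The statement that the infinitesimals transform linearly even though the coordinates do not can be checked with a small but finite displacement. In this sketch (illustrative point and displacement; `math.atan2` is used as the quadrant-aware version of arctan(y/x)), the exact change in $(r, \theta)$ is compared with the prediction of the matrix with entries $x/r$, $y/r$, $-y/r^2$, $x/r^2$ acting on $(dx, dy)$. The two should agree up to terms of second order in the displacement.

```python
import math

def to_polar(x, y):
    """(r, theta) for a Cartesian point; atan2 picks the correct quadrant for arctan(y/x)."""
    return math.hypot(x, y), math.atan2(y, x)

def jacobian_S(x, y):
    """S^mu_nu = d x'^mu / d x^nu for x' = (r, theta) and x = (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r], [-y / r2, x / r2]]

x, y = 1.0, 2.0            # illustrative point
dx, dy = 1e-5, -2e-5       # small displacement

r0, theta0 = to_polar(x, y)
r1, theta1 = to_polar(x + dx, y + dy)

S = jacobian_S(x, y)
dr_linear = S[0][0] * dx + S[0][1] * dy
dtheta_linear = S[1][0] * dx + S[1][1] * dy

print(r1 - r0, dr_linear)
print(theta1 - theta0, dtheta_linear)
```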
Going from primed to unprimed coordinates, we expect to encounter the inverse of (7):
duh, we are going the other way. Again, using elementary calculus, we write (with the
index $\rho$ summed over as per the repeated index convention)

$$dx^\mu = \frac{\partial x^\mu}{\partial x'^\rho}\,dx'^\rho = (S^{-1})^\mu{}_\rho(x')\,dx'^\rho \tag{10}$$

where

$$(S^{-1})^\mu{}_\rho = \frac{\partial x^\mu}{\partial x'^\rho} \tag{11}$$

Since we are simply performing the inverse transformation, $S^{-1}$ in (10) has to be, as we
just said, the inverse of the matrix S introduced in (7). Nevertheless, let us pause to verify
the obvious. Use the chain rule of calculus:
$(S^{-1})^\mu{}_\rho S^\rho{}_\nu = \frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x'^\rho}{\partial x^\nu} = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu$, where the
Kronecker delta is defined in analogy with the Kronecker delta used in (I.3.16) for rotations,
namely that $\delta^\mu_\nu = 1$ if $\mu = \nu$ and 0 otherwise. In other words, the Kronecker delta is just a
fancy way of describing the identity matrix.* Thus, we have verified that, indeed, $S^{-1}S = I$.

In the simple polar example, (10) just consists of $dx = \cos\theta\,dr - r\sin\theta\,d\theta$ and
$dy = \sin\theta\,dr + r\cos\theta\,d\theta$. Thus, $(S^{-1})^1{}_1 = \cos\theta$, $(S^{-1})^1{}_2 = -r\sin\theta$, $(S^{-1})^2{}_1 = \sin\theta$, and
$(S^{-1})^2{}_2 = r\cos\theta$.

Let me also quickly put another potential source of confusion to rest. I have written S as
a function of x and $S^{-1}$ as a function of $x'$, but of course any function of x could be written
as a function of $x'$ and vice versa.
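
Verifying the obvious is cheap on a computer too. The sketch below (illustrative point, hypothetical function names) builds the polar-coordinate matrix S with entries $x/r$, $y/r$, $-y/r^2$, $x/r^2$ as a function of (x, y), builds its claimed inverse with entries $\cos\theta$, $-r\sin\theta$, $\sin\theta$, $r\cos\theta$ as a function of $(r, \theta)$, evaluates both at the same point, and multiplies them in either order.

```python
import math

def jacobian_S(x, y):
    """S^mu_nu = d x'^mu / d x^nu, written as a function of the unprimed coordinates."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r], [-y / r2, x / r2]]

def jacobian_S_inverse(r, theta):
    """(S^-1)^mu_rho = d x^mu / d x'^rho, written as a function of the primed coordinates."""
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

x, y = 1.0, 2.0                                   # illustrative point
r, theta = math.hypot(x, y), math.atan2(y, x)     # the same point in polar coordinates

S = jacobian_S(x, y)
S_inv = jacobian_S_inverse(r, theta)
left = matmul(S_inv, S)    # should be the identity matrix
right = matmul(S, S_inv)   # likewise

print(left)
print(right)
```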


A general formalism for changing coordinates 


We want to know how the metric $g_{\mu\nu}$ transforms when we change from one set of coordinates
$x^\mu$ to another set of coordinates $x'^\mu$. We know that the square of the infinitesimal
distance between two neighboring points does not depend on our choice of coordinate
system:

$$ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu = g'_{\rho\sigma}(x')dx'^\rho dx'^\sigma \tag{12}$$

This requirement fixes the relation between $g_{\mu\nu}$ and $g'_{\rho\sigma}$, as we will now see.

Let me reassure the reader seeing this for the first time and finding all these indices 
a bit fearsome. Keep in mind that what we are doing is nothing but a more general and 
compact packaging of something that is conceptually quite simple, almost trivial, if you 
have mastered calculus. For this class of readers, I will, in what follows, jump back and forth 
between the general and the specific. At each step, we will review the familiar change from
Cartesian (x, y) to polar $(r, \theta)$ coordinates, and point out how this illustrates the general
point. Indeed, in this example, (12) is just (1), namely $ds^2 = dx^2 + dy^2 = dr^2 + r^2 d\theta^2$
written in index notation.

* Note that the $\delta$ here carries one upper and one lower index.

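Eliminate $dx^\mu dx^\nu$ in (12) in favor of $dx'^\rho dx'^\sigma$ using (10), and obtain

$$g'_{\rho\sigma}(x')dx'^\rho dx'^\sigma = g_{\mu\nu}(x)\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}dx'^\rho dx'^\sigma \tag{13}$$

Note that all repeated indices in (13) are summed over. To help the abecedarian reader, I
now restore temporarily the summation symbol and write the right hand side of (13) more
slowly and explicitly as

$$\sum_\mu\sum_\nu g_{\mu\nu}\,dx^\mu dx^\nu = \sum_\mu\sum_\nu g_{\mu\nu}\left(\sum_\rho \frac{\partial x^\mu}{\partial x'^\rho}dx'^\rho\right)\left(\sum_\sigma \frac{\partial x^\nu}{\partial x'^\sigma}dx'^\sigma\right) = \sum_\mu\sum_\nu\sum_\rho\sum_\sigma g_{\mu\nu}\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}dx'^\rho dx'^\sigma$$

I hope you see the advantage of following Einstein and dropping all summation symbols.
Since the infinitesimals $dx'^\rho$ and $dx'^\sigma$ in (13) are arbitrary, we can identify the coefficient
of $dx'^\rho dx'^\sigma$ to find

$$g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma} \tag{14}$$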


This tells us how the metric $g'_{\rho\sigma}(x')$ in the primed coordinate system is related to the metric
$g_{\mu\nu}(x)$ in the unprimed coordinate system. Note that we are relating the two metrics at the
same point P: x and $x'$ are merely the different coordinate values corresponding to P.
Given the metric in one coordinate system, we can work out the metric in some other
coordinate system using* (14). For example, applied to the polar case, one of the equations
in (14) says that $g'_{22}(x') = g_{11}\left(\frac{\partial x^1}{\partial x'^2}\right)^2 + g_{22}\left(\frac{\partial x^2}{\partial x'^2}\right)^2 = r^2$. (As remarked earlier, we could
write this also as $g'_{\theta\theta}(r, \theta) = g_{xx}\left(\frac{\partial x}{\partial\theta}\right)^2 + g_{yy}\left(\frac{\partial y}{\partial\theta}\right)^2 = r^2$.) You should work out the other
components of $g'$ in this way. As another exercise, check the formalism here for spherical
coordinates.
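
A numerical version of the spherical-coordinate check might look as follows. This is a sketch under stated assumptions: the derivatives $\partial x^\mu/\partial x'^\rho$ are approximated by central finite differences, the sample point is arbitrary, and the function names are illustrative. The transformation law $g'_{\rho\sigma} = g_{\mu\nu}\,(\partial x^\mu/\partial x'^\rho)(\partial x^\nu/\partial x'^\sigma)$ is applied to the identity metric of Cartesian coordinates, and the result is compared with the diagonal entries $1$, $r^2$, $r^2\sin^2\theta$.

```python
import math

def cartesian_from_spherical(q):
    """x^mu as functions of x'^rho = (r, theta, phi)."""
    r, theta, phi = q
    return [
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    ]

def inverse_jacobian(x_of_q, q, h=1e-6):
    """J[mu][rho] = d x^mu / d x'^rho, approximated by central differences."""
    D = len(q)
    columns = []
    for rho in range(D):
        q_plus, q_minus = list(q), list(q)
        q_plus[rho] += h
        q_minus[rho] -= h
        x_plus, x_minus = x_of_q(q_plus), x_of_q(q_minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(x_plus, x_minus)])
    return [[columns[rho][mu] for rho in range(D)] for mu in range(D)]

def transform_metric(g, J):
    """g'_{rho sigma} = g_{mu nu} J^mu_rho J^nu_sigma, with mu and nu summed."""
    D = len(g)
    return [
        [
            sum(g[mu][nu] * J[mu][rho] * J[nu][sigma] for mu in range(D) for nu in range(D))
            for sigma in range(D)
        ]
        for rho in range(D)
    ]

q = [2.0, 0.7, 1.3]   # illustrative values of (r, theta, phi)
g_cartesian = [[1.0 if mu == nu else 0.0 for nu in range(3)] for mu in range(3)]
g_spherical = transform_metric(g_cartesian, inverse_jacobian(cartesian_from_spherical, q))
expected_diagonal = [1.0, q[0] ** 2, (q[0] * math.sin(q[1])) ** 2]

for row in g_spherical:
    print(row)
```

Swapping in a different coordinate map only requires replacing `cartesian_from_spherical`; the transformation step itself does not change.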


Upper versus lower 


In going from unprimed to primed coordinates, we have (7) $dx'^\mu = S^\mu{}_\nu(x)dx^\nu$ and (14)
$g'_{\rho\sigma}(x') = g_{\mu\nu}(x)(S^{-1})^\mu{}_\rho(S^{-1})^\nu{}_\sigma$. Practical calculation aside, we notice that, in going from
unprimed to primed coordinates, an upper index $\mu$ transforms with S and a lower index
$\rho$ transforms with the inverse matrix $S^{-1}$. An important conceptual point, but clearly
this must be the case: for $ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu$ to remain invariant, the upper and lower
indices must transform oppositely, so that the S and the $S^{-1}$ can knock each other out.
Fancy people call the upper index contravariant and the lower index covariant; I can never
remember which is which. If you like big words, go for it.

* In practice, it is actually somewhat more direct to apply calculus to (12), rather than to use (14): again for
the polar example, going the other way from $(r, \theta)$ to (x, y) for a change, we have

$$ds^2 = dr^2 + r^2 d\theta^2 = \left(\frac{x\,dx + y\,dy}{r}\right)^2 + r^2\left(\frac{x\,dy - y\,dx}{r^2}\right)^2 = dx^2 + dy^2$$

From (9), we notice another important point. Look at $S^\mu{}_\nu(x) = \frac{\partial x'^\mu}{\partial x^\nu}$: the upper index $\nu$
on the right hand side emerges as a lower index on the left hand side. This makes a certain
amount of “intuitive” sense: on the right hand side we sort of “divide” by $dx^\nu$. We will use
the shorthand notation for the differential operator

$$\partial_\nu \equiv \frac{\partial}{\partial x^\nu} \tag{15}$$

To check that it is consistent for $\partial_\nu$ to carry a lower index, note that, using the chain rule,

$$\partial'_\mu \equiv \frac{\partial}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\frac{\partial}{\partial x^\nu} = (S^{-1})^\nu{}_\mu\,\partial_\nu \tag{16}$$

This is precisely how a lower index should transform, with the inverse matrix $S^{-1}$. Notice
also that this generalizes what we learned in chapter I.3 about how $\partial/\partial x^i$ transforms under
a rotation.
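
The chain-rule statement $\partial'_\mu = (S^{-1})^\nu{}_\mu\,\partial_\nu$ can be tested on any scalar function. In the sketch below the scalar $\phi = x^2 y$, the sample point, and the function names are hypothetical choices made for illustration. The polar-coordinate derivatives of $\phi$ are computed twice: once by contracting the Cartesian gradient with the polar-coordinate inverse matrix (entries $\cos\theta$, $-r\sin\theta$, $\sin\theta$, $r\cos\theta$), and once by differentiating $\phi(r\cos\theta, r\sin\theta)$ directly with finite differences.

```python
import math

def phi(x, y):
    """A sample scalar function (hypothetical choice)."""
    return x * x * y

def gradient_cartesian(x, y):
    """(d phi / dx, d phi / dy), computed by hand for the sample function."""
    return [2 * x * y, x * x]

def jacobian_S_inverse(r, theta):
    """(S^-1)^nu_mu = d x^nu / d x'^mu for x = (x, y) and x' = (r, theta)."""
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]

r, theta = 1.5, 0.6   # illustrative point
x, y = r * math.cos(theta), r * math.sin(theta)

S_inv = jacobian_S_inverse(r, theta)
d_phi = gradient_cartesian(x, y)

# Lower index: the sum runs over the FIRST (upper) index nu of (S^-1)^nu_mu.
gradient_polar_chain = [
    sum(S_inv[nu][mu] * d_phi[nu] for nu in range(2)) for mu in range(2)
]

def phi_polar(r, theta):
    return phi(r * math.cos(theta), r * math.sin(theta))

h = 1e-6
gradient_polar_direct = [
    (phi_polar(r + h, theta) - phi_polar(r - h, theta)) / (2 * h),
    (phi_polar(r, theta + h) - phi_polar(r, theta - h)) / (2 * h),
]

print(gradient_polar_chain)
print(gradient_polar_direct)
```

Note the index placement in the sum: contracting over the second index of the inverse matrix instead would give the wrong answer.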

In the two preceding paragraphs, we talked about going from unprimed to primed 
coordinates. Going from primed to unprimed coordinates, we have of course, as already 
remarked, the inverse transformations: $dx^\mu = (S^{-1})^\mu{}_\rho(x')dx'^\rho$ and $g_{\mu\nu}(x) = g'_{\rho\sigma}(x')S^\rho{}_\mu S^\sigma{}_\nu$.
The roles played by S and $S^{-1}$ are interchanged: an upper index transforms with $S^{-1}$ and
a lower index with S. 

It is also illuminating to write the transformation law of $g_{\mu\nu}$ in terms of matrices.
Given a matrix $M^\mu{}_\nu$, define the transpose by $(M^T)_\nu{}^\mu = M^\mu{}_\nu$. Then we can rephrase the
transformation law of the metric $g'_{\rho\sigma}(x') = g_{\mu\nu}(x)(S^{-1})^\mu{}_\rho(S^{-1})^\nu{}_\sigma$ as

$$g'(x') = (S^{-1})^T g(x)\,S^{-1} \tag{17}$$


Now you can make contact with something you know only all too well, rotations. Let
us see how the formalism just developed works for rotations. For a rotation matrix R,
$R^T R = R R^T = I$, so that $R^{-1} = R^T$. Thus, if S happens to be a rotation matrix R, that is, if
the transformation from unprimed to primed coordinates is a mere rotation, then in this
special case (and only in this special case) $(S^{-1})^T = (S^T)^T = S$. The transformation law (17)
of the metric collapses to $I = S I S^{-1}$, since in this case the metric is just the identity matrix
I in both the unprimed and primed coordinates. In other words, rotations are defined as
linear coordinate transformations that leave the Euclidean metric invariant, as we learned
in chapter I.3.

Regarding $g_{\mu\nu}$ as a matrix, we naturally think of its inverse, which we denote by $g^{\mu\nu}$. In
other words, the matrix $g^{\mu\nu}$ is defined by

$$g^{\mu\nu}g_{\nu\rho} = \delta^\mu_\rho \tag{18}$$

Here the index $\nu$ is summed over. (Since the inverse of a symmetric matrix is also
symmetric, $g^{\mu\nu} = g^{\nu\mu}$.) Note that, in accordance with our rule that we are allowed only
to sum over a lower index with an upper index, the inverse of the metric $g^{\mu\nu}$ carries two
upper indices. Normally, we denote the inverse of a matrix M by $M^{-1}$. However, here it
is customary to omit the superscript (−1): the placement of the indices suffices to tell us
that $g_{\mu\nu}$ is the metric and $g^{\mu\nu}$ is its inverse.
Most of the metrics we deal with are diagonal. If so, then the inverse metric is a cinch to
write down. For example, for spherical coordinates $g^{rr} = 1$, $g^{\theta\theta} = 1/r^2$, $g^{\varphi\varphi} = 1/(r^2 \sin^2\theta)$.
Not surprisingly, when we go from unprimed to primed coordinates, while the metric
$g_{\mu\nu}$ transforms with $S^{-1}$, the inverse metric transforms with S:


$$g'^{\mu\nu}(x') = S^\mu{}_\rho\, S^\nu{}_\sigma\, g^{\rho\sigma}(x) \qquad (19)$$


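(This sounds entirely plausible, since $g^{\mu\nu}$ is, after all, the inverse of the metric, and so
should transform oppositely to the metric, but you can also easily check that the $g'^{\mu\nu}$ given
in (19) is indeed the inverse of $g'_{\mu\nu}$: $g'^{\mu\nu}(x')\,g'_{\nu\lambda}(x') = \big(S^\mu{}_\rho S^\nu{}_\sigma g^{\rho\sigma}(x)\big)\big(g_{\kappa\omega}(x)(S^{-1})^\kappa{}_\nu (S^{-1})^\omega{}_\lambda\big)$
$= S^\mu{}_\rho\, g^{\rho\sigma}(x)\, g_{\sigma\omega}(x)\,(S^{-1})^\omega{}_\lambda = S^\mu{}_\rho (S^{-1})^\rho{}_\lambda = \delta^\mu_\lambda$.)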

A word or two of encouragement: I agree that for the reader seeing this for the first 
time, all these indices might look overwhelming, but it is actually quite simple, once you 
get used to how the indices hang together. You might also remind yourself that we are doing 
nothing more involved than changing coordinates, but instead of doing it one coordinate 
system at a time as in elementary treatments, here we want to do it in general, for any 
coordinate system. In any case, don’t worry about it if you are having some difficulty. As I 
said, we will discuss all this in more depth later. 


Vectors, scalars, and tensors 


I will now tell you what a vector is in curved coordinates (for example, spherical coordi- 
nates). Eventually, we will discuss vectors, tensors, and all that good stuff in curved space 
in great detail, but for now let’s do it on the quick. The easiest path is to simply generalize 
the familiar notions of scalars and vectors under rotation. We say that $W^\mu(x)$ is a vector
with an upper index if it transforms just like $dx^\mu$ does. In other words,


$$W'^\mu(x') = S^\mu{}_\nu(x)\, W^\nu(x) \qquad (20)$$


when we go from unprimed to primed coordinates. Note that we are simply generalizing
(I.3.18), which describes how a vector field transforms under a rotation.

Indeed, we could now take over our discussion of vectors and tensors in chapter I.4
almost in its entirety. The one novelty, as already noted, is that we now have upper and
lower indices, or as the Jargon Guy would say, contravariant and covariant indices. In
contrast to a vector with an upper index $W^\mu$, which transforms like $dx^\mu$, a vector with a
lower index $W_\mu$ transforms like $\partial_\mu$:


$$W'_\mu(x') = W_\nu(x)\,(S^{-1})^\nu{}_\mu \qquad (21)$$


I like to think of $dx^\mu$ and $\partial_\mu$ as the two “primeval” vectors.
A scalar field $\phi(x)$ transforms like $\phi'(x') = \phi(x)$, again simply generalizing a basic
notion we learned in studying rotation.

From (16), we learn that $\partial'_\mu \phi'(x') = (S^{-1})^\nu{}_\mu\, \partial_\nu \phi(x)$. We say that $\partial_\mu\phi(x)$ transforms like
a vector with a lower index. As I said, this subject will be developed in more detail in part
V, but at the moment all we need is that $W^\mu(x)$ and $\partial_\mu\phi(x)$ transform oppositely, with
S and $S^{-1}$, respectively. Thus, $W^\mu(x)\partial_\mu\phi(x)$ (with the index $\mu$ summed over, of course)
transforms like a scalar. It’s easy to check:


$$W'^\mu(x')\,\partial'_\mu \phi'(x') = S^\mu{}_\nu W^\nu(x)\,(S^{-1})^\lambda{}_\mu\, \partial_\lambda \phi(x) = W^\nu(x)\,\partial_\nu \phi(x)$$


where we used $S^\mu{}_\nu (S^{-1})^\lambda{}_\mu = \delta^\lambda_\nu$.

Given $V_\mu$ and $W^\mu$, you can verify easily that $V'_\mu(x') W'^\mu(x') = V_\mu(x) W^\mu(x)$: the S and $S^{-1}$
in (20) and (21) knock each other off. Colloquially, we say that summed indices disappear.
Indeed, the astute reader will notice that we just showed this in the preceding paragraph,
since $V_\mu$ and $\partial_\mu\phi(x)$ transform in exactly the same way.
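To see the S and $S^{-1}$ knock each other off with actual numbers, here is a small sketch for the change from Cartesian to polar coordinates in the plane. The function names and the sample components are assumptions chosen only for illustration; the transformation rules are (20) for the upper index and (21) for the lower index.

```python
import math

def jacobians(r, theta):
    """Return (S, S_inv) for Cartesian (x, y) -> polar (r, theta) at a point."""
    c, s = math.cos(theta), math.sin(theta)
    S_inv = [[c, -r * s], [s, r * c]]   # d(x, y) / d(r, theta)
    S = [[c, s], [-s / r, c / r]]       # d(r, theta) / d(x, y)
    return S, S_inv

def transform_upper(S, W):
    """W'^mu = S^mu_nu W^nu, as in (20)."""
    return [sum(S[m][n] * W[n] for n in range(2)) for m in range(2)]

def transform_lower(S_inv, V):
    """V'_mu = V_nu (S^-1)^nu_mu, as in (21)."""
    return [sum(V[n] * S_inv[n][m] for n in range(2)) for m in range(2)]

def contract(V, W):
    """Sum a lower index against an upper index."""
    return sum(v * w for v, w in zip(V, W))

S, S_inv = jacobians(1.5, 0.4)
W = [0.3, -1.2]   # arbitrary upper-index components in Cartesian coordinates
V = [2.0, 0.7]    # arbitrary lower-index components in Cartesian coordinates
before = contract(V, W)
after = contract(transform_lower(S_inv, V), transform_upper(S, W))
```

The two numbers `before` and `after` agree up to rounding, even though the individual components change considerably.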

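The concept of tensors arises naturally, just as in chapter I.4, but now tensors can carry
both upper and lower indices. For example, the tensor $T^{\mu\nu\lambda}{}_\rho$ transforms like


$$T'^{\mu\nu\lambda}{}_\rho(x') = S^\mu{}_\sigma\, S^\nu{}_\tau\, S^\lambda{}_\kappa\, (S^{-1})^\omega{}_\rho\, T^{\sigma\tau\kappa}{}_\omega(x) \qquad (22)$$


In other words, $T^{\mu\nu\lambda}{}_\rho$ transforms as if it were composed of the product of four vectors
$W^\mu V^\nu U^\lambda Y_\rho$.

Again, summed indices disappear. For example, setting $\lambda$ equal to $\rho$ in (22) and summing,
we obtain $T'^{\mu\nu\lambda}{}_\lambda(x') = S^\mu{}_\sigma S^\nu{}_\tau\, T^{\sigma\tau\kappa}{}_\kappa(x)$. In other words, $T^{\mu\nu\lambda}{}_\lambda$ is actually a tensor with
two upper indices.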

A basic “theorem” is that we could use the metric to lower indices and its inverse
to raise indices. Given a vector with an upper index $W^\nu$, the claim is that the object
$V_\mu = g_{\mu\nu} W^\nu$ transforms like a vector with a lower index. Simply combine how $g_{\mu\nu}$ transforms
with (20) to obtain $V'_\mu = g'_{\mu\nu} W'^\nu = g_{\lambda\rho}(S^{-1})^\lambda{}_\mu (S^{-1})^\rho{}_\nu\, S^\nu{}_\sigma W^\sigma = g_{\lambda\rho}(S^{-1})^\lambda{}_\mu W^\rho = (S^{-1})^\lambda{}_\mu V_\lambda$,
in agreement with (21), as claimed. Indeed, this little exercise shows that there is no need to
introduce another letter: it is customary to simply write $W_\mu = g_{\mu\nu}W^\nu$. I will leave it to you to
verify that, similarly, given a vector with a lower index $U_\nu$, the object $g^{\mu\nu}U_\nu$ transforms like
a vector with an upper index. Since tensors transform as if they were products of vectors,
we can use the metric to lower indices and its inverse to raise indices on tensors at will.

Here is a simple practice problem involving upper and lower indices. Solve the equation
$g_{\mu\nu}A^\nu = B_\mu$ for A. Multiply by $g^{\sigma\mu}$: we obtain $g^{\sigma\mu}g_{\mu\nu}A^\nu = \delta^\sigma_\nu A^\nu = A^\sigma = g^{\sigma\mu}B_\mu$. In other
words, $A^\sigma = g^{\sigma\mu}B_\mu$. We now rename the index $\sigma$ and call it $\nu$ and write $A^\nu = g^{\nu\mu}B_\mu$. With
practice, you could simply omit the intermediate steps and go directly from $g_{\mu\nu}A^\nu = B_\mu$
to $A^\nu = g^{\nu\mu}B_\mu$. We may think of this colloquially as flipping the metric from the left hand
side to the right hand side, where it flips over to become its inverse. Some readers might
think that I am belaboring the obvious, but then they would not believe how some students
flip out if I omit the intermediate steps when I teach. Indeed, it really is clear if we regard
g as a matrix, and A, B as vectors: $gA = B$ implies $A = g^{-1}B$.
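The practice problem can be mimicked numerically. The sketch below lowers an index with a sample symmetric 2-by-2 metric and then raises it again with the inverse; the specific numbers and the helper names are assumptions made up for the illustration, not anything from the text.

```python
def lower_index(g, A):
    """B_mu = g_{mu nu} A^nu."""
    n = len(A)
    return [sum(g[mu][nu] * A[nu] for nu in range(n)) for mu in range(n)]

def raise_index(g_inv, B):
    """A^nu = g^{nu mu} B_mu."""
    n = len(B)
    return [sum(g_inv[nu][mu] * B[mu] for mu in range(n)) for nu in range(n)]

def inverse_2x2(g):
    """Inverse of a 2-by-2 matrix, playing the role of g^{mu nu}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

g = [[2.0, 1.0], [1.0, 3.0]]   # a sample symmetric, non-diagonal metric
A = [1.0, -2.0]                # upper-index components
B = lower_index(g, A)          # g A = B
A_back = raise_index(inverse_2x2(g), B)   # A = g^{-1} B
```

Flipping the metric from one side to the other, where it becomes its inverse, returns the original components `A`.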

Regard this as a first brush with tensors. Much more later. 
Area and volume


As you will see, in studying relativity, we constantly ask how various quantities transform. 
The importance of transformation is in fact central to theoretical physics. In elementary 
physics, this importance is disguised to some extent, as the quantities the student is 
likely to encounter have already been formulated to transform properly. In more advanced 
physics, quantum field theory, for example, the requirement of transforming properly 
often defines the concept in question. 

Let me give you an elementary example that beginning students can all relate to: the
concept of area and volume. In flat space, the infinitesimal volume element is given by $d^3x$
in Cartesian coordinates, but what is it in spherical coordinates? You probably obtained
the answer in a course on integral calculus by drawing a curved infinitesimal volume
element bounded by the eight points $(r + n_r dr,\ \theta + n_\theta d\theta,\ \varphi + n_\varphi d\varphi)$, where $n_r, n_\theta, n_\varphi = 0$
or 1, and approximating it by a rectilinear volume. Clearly, $d^3x$ cannot be a volume
except in Cartesian coordinates. In spherical coordinates, $d^3x = dr\,d\theta\,d\varphi$ does not even
have dimensions of length cubed. The power of the metric tensor formalism developed
here is that it will tell us what the correct volume element is in any coordinates, even
those for which $g_{\mu\nu}$ is not diagonal, and in any dimensions (for D = 1, we are talking
about a length element, for D = 2 an area element, for D = 3 a volume element, and
so on).

The point is that $d^Dx$ does not transform properly under a coordinate transformation
$x \to x'$. This is just a fancy way of stating the obvious: $dx\,dy\,dz$ and $dr\,d\theta\,d\varphi$ are surely not
equal to each other. (They don’t even have the same dimension!) Indeed, you know that a
Jacobian J is needed when you change integration variables. You learned in calculus that
$d^Dx = d^Dx'\,J$, where J is the determinant of the D-by-D matrix $\partial x/\partial x'$.

Go back to (14), which shows how the metric transforms, and take the determinant of
that equation. Use the fact that the determinant of a product of matrices is the product of
their determinants. You obtain $g' = gJ^2$, denoting the determinant of $g_{\mu\nu}$, regarded as a
D-by-D matrix, by* g and similarly for the primed quantities.

Putting these two facts together, we obtain 


$$d^Dx\,\sqrt{g} = d^Dx'\,J\sqrt{g} = d^Dx'\,\sqrt{gJ^2} = d^Dx'\,\sqrt{g'} \qquad (23)$$


In other words, $d^Dx\,\sqrt{g}$ is invariant under coordinate transformation. We learned that it
is not $d^Dx$, but the combination $d^Dx\,\sqrt{g}$, that has intrinsic geometric significance as a
volume element, intrinsic in the sense that it does not depend on our coordinate choice.†
I will show you presently that this reproduces the volume element you have known since
childhood.


* We have also used the letter g to refer to the metric, but the context should remove any potential confusion.
† The Jargon Guy would say that (23) defines $\sqrt{g}$ as a scalar density. A tensor density is then defined as a
tensor multiplied by $\sqrt{g}$. For me, the less jargon the better.
In spherical coordinates, $ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2$ with $(x^1, x^2, x^3) = (r, \theta, \varphi)$.
We regard the metric $g_{\mu\nu}$ as the matrix


$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}$$


You can calculate the determinant of this matrix by eyeball: $g = r^4 \sin^2\theta$. In Cartesian
coordinates, the metric is the identity matrix, and so you don’t even need to exercise
your eyeball to calculate the determinant: g = 1. Thus, applied to these two coordinate
systems, the statement (23) that $d^Dx\,\sqrt{g}$ is an invariant amounts to saying that $dx\,dy\,dz =
dr\,d\theta\,d\varphi\; r^2 \sin\theta$, as you have always known.*
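A simple way to convince yourself that $\sqrt{g}$ is the right weight is to integrate it numerically. The sketch below, under the assumption of a plain midpoint rule and with function names of my own choosing, sums $\sqrt{g}\,dr\,d\theta\,d\varphi$ over a ball of radius R and compares with $\frac{4}{3}\pi R^3$.

```python
import math

def sqrt_g_spherical(r, theta):
    """Square root of g = det diag(1, r^2, r^2 sin^2 theta) = r^4 sin^2 theta."""
    return math.sqrt(r**4 * math.sin(theta)**2)

def ball_volume(R, n=200):
    """Midpoint-rule estimate of the integral of sqrt(g) dr dtheta dphi over a ball."""
    dr = R / n
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            total += sqrt_g_spherical(r, theta) * dr * dtheta
    # The integrand does not depend on phi, so that integral is just 2*pi.
    return total * 2.0 * math.pi

estimate = ball_volume(2.0)
exact = 4.0 / 3.0 * math.pi * 2.0**3
```

Dropping the factor $\sqrt{g}$ and integrating the bare $dr\,d\theta\,d\varphi$ would instead give $2\pi^2 R$, which does not even have the dimension of a volume.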

You should now reproduce all the area and volume elements in all the coordinate systems 
you know. The important point to appreciate, as I have already stressed, is the generality 
of (23). In exercise 11, you will see that this formalism enables you to calculate the area of 
higher dimensional spheres. 

As a preview of exciting things to come, let me tell you that Einstein’s field equation
contains a “famous” $\frac{1}{2}$ that both Einstein and Hilbert failed to obtain in their separate first
tries. At this stage, I will whet your appetite with a cryptic remark that the square root in
(23) is responsible for this all-important $\frac{1}{2}$ in the history of physics. (See chapter VI.4.)


Local versus global 


A couple of remarks are in order. 

Typically, we need more than one set of coordinates to cover an entire space. The sphere
provides an example. The $(\theta, \varphi)$ coordinates (aka latitude and longitude) fail at the north
pole and the south pole. (Didn’t you ask your teacher what the longitude of the north
pole was? The correct answer is that it is undefined.) One symptom of this failure is that
$g_{\varphi\varphi} = \sin^2\theta$ vanishes at the poles. Since the coordinate system fails† at only two points,
most physicists simply ignore this failure as a technicality and happily use the spherical
coordinates until they run into trouble,¹ which is almost never. In this example, clearly
nothing intrinsically bad happens at the two poles: on the sphere, every point is as good as
any other. All we have to do is to set up a coordinate system with coordinates $(\theta', \varphi')$ with
some point (other than the two poles) and its antipodal point playing the role of the north
and south poles for this primed coordinate system.

In general, we may need several “coordinate patches” to cover the entire space. One patch 
must overlap with another such that in the overlap region, the two sets of coordinates are 
related to each other by smooth differentiable functions. 


* Note that although we are using a formalism appropriate for curved space to derive these expressions, when 
we apply them to spherical coordinates, we are dealing with flat space, of course. 
† Incidentally, this also implies that g vanishes and hence the inverse metric $g^{\mu\nu}$ fails to exist at that point.

Similarly, the polar coordinates are, strictly speaking, not defined at the origin, since $g_{\theta\theta}$
vanishes there.

You might have also noticed that in the metric $ds^2 = dr^2 + r^2 d\theta^2$, and in (4) and (5),
I have implicitly assumed, as is customary, that $(r, \theta)$ and $(r, \theta + 2\pi)$ describe the same
point. In other words, $\theta$ is an angular variable. But nothing in the metric itself tells us that.
By its very nature, the infinitesimal distance between two nearby points cannot tell us
anything about the global character of the space. The identification $\theta \equiv \theta + 2\pi$ is a global
statement.

Indeed, we can imagine cutting out a wedge of angle $\alpha$ from the flat piece on which we
have mentally drawn the polar coordinates and gluing the two edges together. Evidently,
this forms a cone characterized by the angle $\alpha$. In other words, we now identify $(r, \theta)$ and
$(r, \theta + 2\pi - \alpha)$. Remember the ant in the prologue?


A quick summary 


Since this has been a fairly long trek, it may be helpful to have a brief summary. The 
metric was introduced, and you learned that the metric enabled you to calculate the 
curvature of space (only in principle, not yet in practice, for any dimensions). The Poincaré 
half plane gave an interesting example of how a simple looking metric could describe a 
strange space. Along the way, we also got acquainted with upper and lower indices, and 
understood how vectors and tensors could be defined by how they transformed under 
coordinate transformation. In particular, we figured out how the metric transformed. As 
an application, we determined the generalized volume element for any curved space. 


Appendix 1: Mite geometers draw circles and dream of black holes 


Go back to the mite professors of geometry in the prologue, busily drawing circles of radius $\epsilon$ around any given
point P and calculating the curvature R by evaluating $R = \lim_{\text{radius}\to 0} \frac{6}{\text{radius}^2}\left(1 - \frac{\text{circumference}}{2\pi\,\text{radius}}\right)$. If the space is
flat, R vanishes.

Let us see how this works for the metric (4) with $\theta$ and $\theta + 2\pi$ identified. The arithmetic simplifies enormously
if we pick P to be the origin. The distance from the origin (0, 0) to the point $(\epsilon, 0)$ is given by $\int_0^\epsilon ds = \int_0^\epsilon d\rho = \epsilon$.
In other words, the set of points with coordinates $(\epsilon, \theta)$ form a circle. Then the circumference is given by
$\int_0^{2\pi} d\theta \sin\epsilon = 2\pi \sin\epsilon$. Thus, $R = \lim_{\epsilon\to 0} \frac{6}{\epsilon^2}\left(1 - \frac{\sin\epsilon}{\epsilon}\right) = 1$ and the space is curved. Perhaps you have already
recognized that the metric (4) is just the metric of the unit sphere (3) with $\theta$ and $\varphi$ disguised as $\rho$ and $\theta$,
respectively.

As for the metric (5), it turns out that the space it describes is flat. The distance from the origin (0, 0) to the
point $(\epsilon, \theta)$ is now given by $\int_0^\epsilon ds = \int_0^\epsilon \cos\rho\, d\rho = \sin\epsilon$. Now R = 0 at the origin. Indeed, you might have seen
immediately that the metric in (5) is just $ds^2 = dr^2 + r^2 d\theta^2$ with the change of variable $r \to \sin\rho$.

You are now catching on and can tackle any metric. An interesting class of D = 2 metric is given by $ds^2 =
f(r)dr^2 + r^2 d\theta^2$. The geometrical meaning is clear. We have filled space with circles (consisting of the points
$(r, \theta)$ with $\theta$ varying from 0 to $2\pi$) centered around the origin. For each circle, the radius is equal to $\int_0^r dr' \sqrt{f(r')}$,
while the circumference is given by $\int_0^{2\pi} d\theta\, r = 2\pi r$. A simple calculation shows that for $f(r) \simeq 1 + \gamma r^2$ and small
r, the curvature at the origin is given by $R = \gamma$.
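The mite geometers' recipe is easy to imitate numerically. The sketch below assumes the prologue formula $R = \lim \frac{6}{\text{radius}^2}\left(1 - \frac{\text{circumference}}{2\pi\,\text{radius}}\right)$ and a midpoint rule for the radius integral; the function names `proper_radius` and `mite_curvature` are my own labels.

```python
import math

def proper_radius(f, r, n=2000):
    """Midpoint-rule estimate of the integral of sqrt(f(r')) dr' from 0 to r."""
    h = r / n
    return sum(math.sqrt(f((k + 0.5) * h)) for k in range(n)) * h

def mite_curvature(f, r):
    """Estimate R at the origin of ds^2 = f(r) dr^2 + r^2 dtheta^2 from a small circle."""
    radius = proper_radius(f, r)
    circumference = 2.0 * math.pi * r
    return 6.0 / radius**2 * (1.0 - circumference / (2.0 * math.pi * radius))

gamma = 0.3
R_generic = mite_curvature(lambda r: 1.0 + gamma * r**2, 1e-2)

# The unit sphere in this form: r = sin(rho) gives f(r) = 1 / (1 - r^2).
R_sphere = mite_curvature(lambda r: 1.0 / (1.0 - r * r), 1e-2)
```

For a small circle the first estimate comes out close to $\gamma$ and the second close to 1, in line with the result quoted above, since $1/(1 - r^2) \simeq 1 + r^2$ for small r.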

We could generalize the preceding discussion to D = 3 for $ds^2 = f(r)dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2$. Fill space
with spheres, with the surface area of each sphere given by $4\pi r^2$ and radius depending on f(r). We will study
metrics of this form when we encounter black holes.


Appendix 2: A peculiar description of flat space


Here, at this early stage in this text, I give you a particularly fun description of flat space, which we will need 
much later in part VII, when we study rotating black holes. Spherical coordinates are so nice that normally hardly
anybody would think of mucking around with them, but let’s do precisely that and write $x = f(r) \sin\theta \cos\varphi$, $y =
f(r) \sin\theta \sin\varphi$, $z = r \cos\theta$. Start with Pythagoras and describe flat space with $ds^2 = dx^2 + dy^2 + dz^2$. Plug in
$dx = f'(r) \sin\theta \cos\varphi\, dr + f(r) \cos\theta \cos\varphi\, d\theta - f(r) \sin\theta \sin\varphi\, d\varphi$, and so on, to obtain


$$ds^2 = dx^2 + dy^2 + dz^2 = (f'^2 \sin^2\theta + \cos^2\theta)\,dr^2 + (f^2 \cos^2\theta + r^2 \sin^2\theta)\,d\theta^2 + f^2 \sin^2\theta\, d\varphi^2 + 2(ff' - r) \sin\theta \cos\theta\, dr\,d\theta \qquad (24)$$


You might have recognized that we have merely worked out various versions of (12), (13), and (14) for a specific 
example. 

A couple of lessons here. For some arbitrary f(r), (24) is a metric that nobody would love, or even want to 
calculate with. But by construction, it certainly describes flat space. While we are used to diagonal $g_{\mu\nu}$, we see that
an off-diagonal term $dr\,d\theta$ appears easily. To get rid of this off-diagonal term, set $ff' = r$, which implies $f^2 = r^2 + a^2$.
For a = 0, we recover the usual spherical coordinates, but perhaps surprisingly, we find that for $a \neq 0$, we could
describe flat space with what are known as Boyer-Lindquist coordinates:


$$ds^2 = \frac{r^2 + a^2 \cos^2\theta}{r^2 + a^2}\, dr^2 + (r^2 + a^2 \cos^2\theta)\,d\theta^2 + (r^2 + a^2) \sin^2\theta\, d\varphi^2 \qquad (25)$$


Instead of spheres, the surfaces of fixed r are now ellipsoids with cylindrical symmetry. Strangely, at least for
those seeing this for the first time, r = 0 is not a point! Instead, it is a disk (a flattened ellipsoid) with radius a.
Set r = 0 to obtain $x = a \sin\theta \cos\varphi$, $y = a \sin\theta \sin\varphi$, $z = 0$, and we see that as $\theta$ and $\varphi$ vary, this sweeps out a
disk. In fact, $\theta$ plays the role of a “radial” rather than an “angular” variable: as $\theta$ goes from 0 to $\pi/2$, we go from
the center of the disk to its edge. In other words, $(r, \theta) = (0, \pi/2)$ describes a ring with radius a.
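If you would rather not push the algebra through by hand, you can check numerically that these coordinates describe flat space. The following sketch assumes $f^2 = r^2 + a^2$, builds the Jacobian of the map $(r, \theta, \varphi) \to (x, y, z)$ by central finite differences, and compares the resulting metric with the diagonal entries of (25). The names `embed`, `pulled_back_metric`, and `metric_25_diagonal` are hypothetical helpers introduced only for this check.

```python
import math

def embed(r, theta, phi, a):
    """Cartesian (x, y, z) for the coordinates of Appendix 2 with f^2 = r^2 + a^2."""
    f = math.sqrt(r * r + a * a)
    return (f * math.sin(theta) * math.cos(phi),
            f * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def pulled_back_metric(r, theta, phi, a, h=1e-6):
    """g_{ij} = sum_k (dX^k/dq^i)(dX^k/dq^j), derivatives by central differences."""
    q = [r, theta, phi]
    derivs = []
    for i in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        X_plus, X_minus = embed(*q_plus, a), embed(*q_minus, a)
        derivs.append([(p - m) / (2.0 * h) for p, m in zip(X_plus, X_minus)])
    return [[sum(derivs[i][k] * derivs[j][k] for k in range(3))
             for j in range(3)] for i in range(3)]

def metric_25_diagonal(r, theta, a):
    """Diagonal entries (g_rr, g_theta theta, g_phi phi) read off from (25)."""
    rho2 = r * r + a * a * math.cos(theta)**2
    return [rho2 / (r * r + a * a), rho2, (r * r + a * a) * math.sin(theta)**2]

g_numeric = pulled_back_metric(1.2, 0.9, 0.3, a=0.7)
g_formula = metric_25_diagonal(1.2, 0.9, a=0.7)
```

The diagonal entries of `g_numeric` match `g_formula`, and the off-diagonal entries vanish to numerical accuracy, as expected once $ff' = r$.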

After examples like this, I hope that you will find black holes a little less strange when you eventually 
encounter them. 


Appendix 3: Divergence, Laplacian, and all the rest 


When I was an undergrad, to do the exercises in a course on electromagnetism, for example, I had to know the 
form of the divergence, the Laplacian, and things like that in various coordinate systems. Since I was taught to 
derive things rather than to look them up or to memorize them, I must have derived the Laplacian in spherical 
coordinates about a hundred times. It got to be kind of annoying. No doubt many readers have had the same 
experience. 

I now show you that the high-powered metric formalism you just learned provides an easy way to derive all 
these objects like the Laplacian directly from the metric g,,,. Of course, they could all be derived also by brute 
force substitution (as I painfully recall). 

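First, we will determine the divergence of a vector field $W^\mu(x)$. The simple minded divergence that works
in Cartesian coordinates $\partial_\mu W^\mu(x) = \sum_{\mu=1}^{D} \frac{\partial W^\mu}{\partial x^\mu}$ does not work in general, because it does not transform like a
scalar (as you can verify; also see below). We want a divergence that does not depend on the coordinate system,
that is, one that transforms like a scalar.

The clever trick is to invoke the integral $\mathcal{I} = \int d^Dx\, \sqrt{g}\, W^\mu(x)\partial_\mu\phi(x)$. We learned earlier that the integrand
$W^\mu(x)\partial_\mu\phi(x)$ is a scalar, and we know from (23) that $d^Dx\,\sqrt{g}$ is also a scalar. Hence $\mathcal{I}$ is invariant under coordinate
transformations. Integrate by parts to find that $\mathcal{I} = -\int d^Dx\, \partial_\mu(\sqrt{g}\,W^\mu)\,\phi = -\int d^Dx\,\sqrt{g}\, \frac{1}{\sqrt{g}}\partial_\mu(\sqrt{g}\,W^\mu)\,\phi$. We
deduce from the coordinate invariance of $\mathcal{I}$ and $d^Dx\,\sqrt{g}$ that the combination


$$D_\mu W^\mu \equiv \frac{1}{\sqrt{g}}\,\partial_\mu(\sqrt{g}\,W^\mu) = \partial_\mu W^\mu + \left(\frac{1}{\sqrt{g}}\,\partial_\mu \sqrt{g}\right) W^\mu \qquad (26)$$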


must be a scalar. At this stage, $D_\mu W^\mu$ is just a convenient symbol; you won’t learn what $D_\mu$ means until a later
chapter. What you have learned here is that the Cartesian divergence $\partial_\mu W^\mu$ must be corrected by adding an
extra term. Of course, in Cartesian coordinates, g = 1 and the extra term vanishes. But in spherical coordinates,
$\sqrt{g} = r^2 \sin\theta$, as noted before, so that the correct expression for the divergence is $\partial_r W^r + \partial_\theta W^\theta + \partial_\varphi W^\varphi +
\frac{2}{r} W^r + \frac{\cos\theta}{\sin\theta} W^\theta$, as you may or may not recall. The point is that (26) gives the divergence in any coordinates and
any curved space.
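Here is a hedged numerical check of (26) in spherical coordinates. The sketch uses central differences and two test fields whose Cartesian divergences are known: the field with Cartesian components (x, y, z), which has divergence 3, and the constant unit field along z, which has divergence 0. Their coordinate-basis components $(W^r, W^\theta, W^\varphi)$ follow from (20); the function names are my own.

```python
import math

def sqrt_g(q):
    """sqrt(g) = r^2 sin(theta) in spherical coordinates q = (r, theta, phi)."""
    r, theta, _phi = q
    return r * r * math.sin(theta)

def divergence(W, q, h=1e-5):
    """(1/sqrt g) d_mu (sqrt g W^mu), derivatives by central differences."""
    total = 0.0
    for mu in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[mu] += h
        q_minus[mu] -= h
        total += (sqrt_g(q_plus) * W(q_plus)[mu]
                  - sqrt_g(q_minus) * W(q_minus)[mu]) / (2.0 * h)
    return total / sqrt_g(q)

def radial_field(q):
    """Cartesian (x, y, z) as a vector field: W^r = r, W^theta = W^phi = 0."""
    return [q[0], 0.0, 0.0]

def z_hat_field(q):
    """Constant unit field along z: W^r = cos(theta), W^theta = -sin(theta)/r."""
    r, theta, _phi = q
    return [math.cos(theta), -math.sin(theta) / r, 0.0]

point = [1.3, 0.8, 0.4]
div_radial = divergence(radial_field, point)
div_z_hat = divergence(z_hat_field, point)
```

Without the $\sqrt{g}$ factors the naive sum $\partial_\mu W^\mu$ would give 1 for the first field and $-\cos\theta/r$ for the second, neither of which agrees with the Cartesian answer.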

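Next, to the Laplacian. Here we are after a general result, independent of specific coordinate systems. The trick
is to consider the integral $\int d^Dx\, \sqrt{g}\, g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi$, which is manifestly a scalar, since the integrand is a scalar (since
$g^{\mu\nu}$ and $\partial_\mu$ transform with S and $S^{-1}$, respectively). Integrate by parts to obtain $-\int d^Dx\, \phi\, \partial_\mu(\sqrt{g}\, g^{\mu\nu}\partial_\nu\phi) =
-\int d^Dx\, \sqrt{g}\, \phi \left(\frac{1}{\sqrt{g}}\right)\partial_\mu(\sqrt{g}\, g^{\mu\nu}\partial_\nu\phi)$. We thus conclude that the Laplacian is given by


$$D^2\phi = \frac{1}{\sqrt{g}}\,\partial_\mu(\sqrt{g}\, g^{\mu\nu}\partial_\nu\phi) \qquad (27)$$


In Cartesian coordinates, this reduces to the usual expression. But, for example, in spherical coordinates,
plugging in $\sqrt{g}$ and $g^{\mu\nu}$, we obtain the Laplacian $\left(\partial_r^2 + \frac{2}{r}\partial_r + \frac{1}{r^2}\partial_\theta^2 + \frac{\cos\theta}{r^2\sin\theta}\partial_\theta + \frac{1}{r^2\sin^2\theta}\partial_\varphi^2\right)\phi$, again, as you may
or may not recall.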
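The same kind of finite-difference sketch works for (27). It assumes the diagonal spherical metric, so only $g^{rr}$, $g^{\theta\theta}$, $g^{\varphi\varphi}$ enter, and it tests three functions whose Cartesian Laplacians are known: $r^2$ (Laplacian 6), $z = r\cos\theta$ (Laplacian 0), and $x^2$ (Laplacian 2). All function names are illustrative.

```python
import math

def sqrt_g(q):
    r, theta, _phi = q
    return r * r * math.sin(theta)

def g_inv_diag(q):
    """Diagonal inverse metric (g^rr, g^theta theta, g^phi phi)."""
    r, theta, _phi = q
    return [1.0, 1.0 / r**2, 1.0 / (r * math.sin(theta))**2]

def shifted(q, mu, h):
    """Copy of q with coordinate number mu displaced by h."""
    out = list(q)
    out[mu] += h
    return out

def laplacian(phi, q, h=1e-4):
    """(1/sqrt g) d_mu (sqrt g g^{mu mu} d_mu phi) by nested central differences."""
    def flux(mu, point):
        dphi = (phi(shifted(point, mu, h)) - phi(shifted(point, mu, -h))) / (2.0 * h)
        return sqrt_g(point) * g_inv_diag(point)[mu] * dphi
    total = 0.0
    for mu in range(3):
        total += (flux(mu, shifted(q, mu, h)) - flux(mu, shifted(q, mu, -h))) / (2.0 * h)
    return total / sqrt_g(q)

point = [1.3, 0.8, 0.4]
lap_r2 = laplacian(lambda q: q[0]**2, point)
lap_z = laplacian(lambda q: q[0] * math.cos(q[1]), point)
lap_x2 = laplacian(lambda q: (q[0] * math.sin(q[1]) * math.cos(q[2]))**2, point)
```

Only `sqrt_g` and `g_inv_diag` know about spherical coordinates; swapping in the corresponding functions for another diagonal metric leaves `laplacian` unchanged, which is the generality the text is advertising.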

My pedagogical strategy is to go from change of coordinates to curved spaces, from which it is a short hop over 
to the curved spacetimes we need for Einstein gravity. You learn here that it pays to learn this general formalism, 
even if you don’t intend to deal with curved spacetimes any time soon. 


Exercises 


1 Suppose we defined $ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu$ as the square of the distance from x to x + dx. One consistency
requirement is that this is the same as the square of the distance from x + dx to x. Show that this requirement
is satisfied.


2 A race of Eskimo mites living around the north pole naturally uses the north pole as the origin of their 
coordinate system and the “walking” distance from the north pole as a distance measure. After years of 
study, their mite geometers figured out that the metric of their world is given to second order by (set the
radius of the sphere to 1)


$$ds^2 = \left(1 - \frac{y^2}{3}\right)dx^2 + \left(1 - \frac{x^2}{3}\right)dy^2 + \frac{2}{3}xy\,dx\,dy + \cdots$$


For $x, y \ll 1$, the space is flat and as Euclidean as it could be. But note that in second order the metric is not
diagonal.

You of course know that their coordinates (x, y) are related to the usual spherical coordinates by $x =
\theta \cos\varphi$ and $y = \theta \sin\varphi$. Furthermore, you know, but Eskimo mites don’t, that actually $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$.
Given your knowledge, derive the metric above.


3 The familiar Mercator map of the world is obtained by transforming spherical coordinates $\theta$, $\varphi$ to coordinates
x, y given by x = #g, y = —# log tan §. (This was first derived by the English mathematician Edward 
Wright in 1599.) Show that $ds^2 = \Omega^2(x, y)(dx^2 + dy^2)$. Determine $\Omega$.


4 Consider the space described by


$$ds^2 = \frac{\rho^2}{r^2 + a^2}\,dr^2 + \rho^2 d\theta^2 + (r^2 + a^2) \sin^2\theta\, d\varphi^2, \quad \text{where } \rho^2 = r^2 + a^2 \cos^2\theta$$


Is the space curved or not? (Hint: r = 0 does not represent a point. Also, study lines of fixed $\theta$ and $\varphi$.)


5 Consider the metric $ds^2 = dr^2 + (r h(r))^2 d\theta^2$, with $\theta$ and $\theta + 2\pi$ identified. For h(r) = 1, this is flat space.
Let h(0) = 1. Show that the curvature at the origin is positive or negative according to whether h(r) starts to
turn downward or upward. Calculate the curvature for $h(r) = \frac{\sin r}{r}$ and for $h(r) = \frac{\sinh r}{r}$.


6 Consider a “fixed latitude” circle defined by $\theta = \epsilon$ around the north pole of a unit sphere. The radius is $\theta$,
while the circumference is $2\pi \sin\theta$. Show that R = 1.


7 Show that $d^Dx\,\sqrt{g}$ gives the correct length element on the unit circle.


8 Here is a simple way of understanding that $d^3x\,\sqrt{g}$ gives the volume element. Consider the metric $ds^2 =
a^2 dx^2 + b^2 dy^2 + c^2 dz^2$. Calculate the volume of an infinitesimal rectilinear container with sides dx, dy, dz.


9 In several areas of theoretical physics, we need to talk about higher dimensional spheres. The d-dimensional
unit sphere $S^d$ is embedded into $E^{d+1}$ by the usual Pythagorean relation $(X^1)^2 + (X^2)^2 + \cdots + (X^{d+1})^2 = 1$.
(We will discuss embedding more in the next chapter. See also exercise 16 below.) Thus, $S^1$ is the circle and
$S^2$ the sphere. Indeed, we may even live on $S^3$. You can readily generalize what you know about polar and
spherical coordinates to higher dimensions by defining


$$X^1 = \cos\theta_1, \quad X^2 = \sin\theta_1 \cos\theta_2, \quad \ldots, \quad X^d = \sin\theta_1 \cdots \sin\theta_{d-1} \cos\theta_d, \quad X^{d+1} = \sin\theta_1 \cdots \sin\theta_{d-1} \sin\theta_d$$


where $0 \le \theta_i < \pi$ for $1 \le i \le d - 1$, but $0 \le \theta_d < 2\pi$. (For $S^2$, $\theta_1 = \theta$ and $\theta_2 = \varphi$. Note that the usual Cartesian
coordinates are trivially permuted from the coordinates used here: $X^1 = Z$, $X^2 = X$, $X^3 = Y$.) Verify the
Pythagorean relation. Show that the metric on $S^d$ works out to be


$$ds_d^2 = \sum_{i=1}^{d+1} (dX^i)^2 = d\theta_1^2 + \sin^2\theta_1\, d\theta_2^2 + \cdots + \sin^2\theta_1 \cdots \sin^2\theta_{d-1}\, d\theta_d^2$$


10 Show that the metrics on the unit spheres satisfy the iterative relation

$$ds_d^2 = d\theta^2 + \sin^2\theta\, ds_{d-1}^2$$

Note that the common observation that curves of fixed latitude on the globe are circles is an example of this
relation.


11 Use the formalism in the text to calculate the area (used in the generalized sense, of course) of $S^d$. Verify
that you recover what you know for d = 2 and 3. Show that the area of $S^3$ is equal to $2\pi^2$, a result that you
will need in quantum field theory.²


12 A squashed sphere: consider $ds^2 = (b^2 + a^2 \cos^2\theta)\,d\theta^2 + \frac{(b^2 + a^2)^2}{b^2 + a^2\cos^2\theta} \sin^2\theta\, d\varphi^2$, with $0 \le \theta \le \pi$ and $0 \le \varphi <
2\pi$, as usual. Find the length of the equator (that is, the curve defined by $\theta = \pi/2$) and of the circle of fixed
longitude going through the two poles (for this, give the numerical result when b = a), and the area of this
squashed sphere. We will need these results later when we discuss rotating black holes.


13 Stereographic projection of the sphere: start out with $ds^2 = \frac{dr^2}{1 - r^2/L^2} + r^2 d\varphi^2$ for a sphere of radius L. (This is of
course the usual spherical coordinates with $L \sin\theta$ disguised as r.) Now imagine the sphere as a transparent
globe with the south pole touching a plane, as shown in figure 3, and a light affixed to the north pole. The
shadows of various points cast on the plane define the stereographic map. Show that the point $(r, \varphi)$ is
mapped to the point $(\rho, \varphi)$ on the plane, with


$$r = \frac{\rho}{1 + \frac{\rho^2}{4L^2}}$$


and by substitution


$$ds^2 = \frac{1}{\left(1 + \frac{\rho^2}{4L^2}\right)^2}\,(d\rho^2 + \rho^2 d\varphi^2)$$

For $S^d$ the projection generalizes to $ds^2 = \frac{1}{\left(1 + \frac{\rho^2}{4L^2}\right)^2}\,(d\rho^2 + \rho^2 d\Omega_{d-1}^2)$.

14 Conformally flat: a space described by $ds^2 = \Omega^2(x)\,ds^2_{\rm flat}$ is said to be conformally flat. We also say that the
metric $g_{\mu\nu}$ related to the flat metric by $g_{\mu\nu}(x) = \Omega^2(x)\,g^{\rm flat}_{\mu\nu}$ is conformally flat. The metric in exercise 13
provides an example. Show that lengths are not the same but that angles are the same as calculated with the
two different metrics. This is clear for the stereographic projection of the sphere. Note, for example, that as
$\rho \to \infty$, a given $\Delta\rho$ corresponds to an ever-smaller $\Delta s$, but as $\rho \to 0$, the two metrics approach each other.
Bad terminology alert: A conformally flat metric does not describe a flat space, as the example of the sphere
makes clear! More generally, two metrics are said to be conformally related to each other if there exists a
function $\Omega(x)$ such that $g_{\mu\nu}(x) = \Omega^2(x)\,\tilde{g}_{\mu\nu}(x)$. Again, it is worth emphasizing that the two metrics are not
related by a coordinate transformation.


Figure 3 Stereographic projection of the sphere. (Labels in the figure: Light, 2L, O, $\rho$.)


15 Verify that the expression given for $D_\mu W^\mu$ in (26) reproduces the usual formula for the divergence in spherical
coordinates.


16 Find the metric on the torus.


17 Show that any d = 2 space is conformally flat.


18 Show that $ds^2 = e^{2v}(du^2 + dv^2)$ is not only conformally flat but literally flat.


Notes 


1. One situation that requires more care is in the discussion of magnetic monopoles. See, for example, QFT
Nut, chapter IV.4.

2. QFT Nut, p. 273. For a rather nifty trick giving the area of $S^d$ for any d, see QFT Nut, p. 539.


Curved Spaces: Gauss and Riemann 


Fear of curves 


Surely you have heard that Einstein gravity involves curved spacetime. In my experience, 
the very mention of the Riemann curvature tensor, which we will come to in due time, 
inspires fear and trembling in many beginning students of general relativity. “I was doing 
well in the course,” students would moan, “until we started doing Riemannian geometry!” 
I actually have a great deal of sympathy for these students: very little in their background 
prepares them for the Riemann curvature tensor. Unless they have taken an exceptionally 
good mechanics course, with studies of particle motion on curved surfaces, they typically 
have no prior exposure to the massive amount of rather technical material needed to 
calculate the Riemann curvature tensor. Dear reader, take heart: even Einstein had to 
struggle to master differential geometry. 

Given this frightening reputation of Riemannian geometry, we will take baby steps 
toward curvature. My pedagogical strategy is to first deal with surfaces that you can 
practically imagine holding in your hands. 


Introducing Professor Flat 


As I was encouraging a fearful student, Professor Flat came ambling along. A mild man- 
nered man with a somewhat disheveled look, he inquired gently, “Why all the trembling?” 
We told him, and he nodded. Then he asked the fearful student, “Do you know why it took 
humans so long to realize that the world was round?” 

FS: “Because the world is locally flat, and humans can’t walk very fast.” 

PF: “Exactly! In everyday life, we have no need to know that the world is actually round, 
as long as the distance scale of interest is small compared to the earth’s radius. If you 
focus on a small enough region on any surface, or any space in any dimension for that
matter, it’s going to look flat, and you could simply study the small deviation from flat
space. Nothing terribly frightening at all!”


Tangent plane 


So indeed, let’s follow Professor Flat’s advice. To get some feel for curvature, let us keep 
things simple and be content to take some nice curved space, roundish like the sphere. 
Imagine approaching it with a flat plane until it touches the surface at some point P. The 
plane is then known as the tangent plane. To focus our minds, consider a sphere of radius 
L. Since all points on it are equivalent, we might as well have the plane touch the south
pole and be perpendicular to the z-axis joining the north and south poles. The southern
hemisphere is defined by $z = -\sqrt{L^2 - x^2 - y^2} + L$, where we added the constant L to
the usual definition of z so that for convenience, z = 0 at the point P we are interested in
(the south pole in this case). Near the south pole, $z \simeq \frac{1}{2L}(x^2 + y^2)$: the sphere is locally
a parabolic bowl and is well approximated by the tangent plane. It is in this sense that
Professor Flat says that the surface is locally flat.

So, consider the tangent plane touching the roundish surface we are studying at some 
point P. See figure 1. Let z denote the coordinate perpendicular to the plane, and (x, y) 
the coordinates in the plane, with the point P coordinatized by (x, y) = (0, 0). Locally, we 
have a quadratic expansion in the small quantities x and y: $z = \frac{1}{2}ax^2 + cxy + \frac{1}{2}by^2$. Here
a, b, c have dimension of inverse length, and thus the local region is defined by x, y small
compared to $a^{-1}$, $b^{-1}$, $c^{-1}$. (For the sphere, c = 0 and a = b = 1/L.) Applying Pythagoras,
we obtain the distance squared between two nearby points with coordinates (x, y) and
(x + dx, y + dy):


$$ds^2 = dx^2 + dy^2 + dz^2 = dx^2 + dy^2 + [(ax + cy)dx + (by + cx)dy]^2 = g_{xx}dx^2 + g_{yy}dy^2 + 2g_{xy}\,dx\,dy \qquad (1)$$


with the metric

$$g_{xx} = 1 + (ax + cy)^2, \quad g_{yy} = 1 + (by + cx)^2, \quad g_{xy} = (ax + cy)(by + cx) \qquad (2)$$


Note that even for the sphere, the metric regarded as a 2-by-2 matrix contains an
off-diagonal term in this locally flat coordinate system. Of course, for $x, y \to 0$, the off-diagonal
term vanishes and the metric approaches the identity matrix.


Figure 1 The tangent plane to a 
curved surface. 
We could also write $z \simeq \frac{1}{2}\mathbf{x}^T M \mathbf{x}$, with $M = \begin{pmatrix} a & c \\ c & b \end{pmatrix}$ and $\mathbf{x} = (x, y)$ a column vector
which we are writing as a row vector for typographical reasons mentioned in chapter
I.3. But we can always rotate the coordinates $\mathbf{x} = (x, y)$ to some other coordinates $\mathbf{x}' =
(u, v)$ given by $\mathbf{x}' = R\mathbf{x}$. We want the curvature to be an invariant geometric concept not
dependent on our coordinate choices. In the new coordinates, M is replaced by $M' =
RMR^{-1}$. We know that M possesses two invariant quantities, namely its two eigenvalues
$\mu$ and $\nu$. But this dovetails perfectly with our discussion in the prologue.

Remember the ant going for her honey? We learned from that story that at a given point, 
a curved space has an intrinsic and an extrinsic curvature. It all makes sense, then: these 
two curvatures correspond to the two invariant quantities contained in M. As you will 
see, we could exploit similar reasoning, asking how things transform under a change of 
coordinates, to determine the curvature of Riemannian spaces of any dimension. 

Let us diagonalize $M$, so that in the new coordinate basis* $z = \tfrac12 \mu u^2 + \tfrac12 \nu v^2$. Thus,
we have the intuitive result that any surface is locally the sum of two parabolas, $z =
\tfrac12 \mu u^2 + \tfrac12 \nu v^2$, with $\mu^{-1}$ and $\nu^{-1}$ the radius of curvature of the parabola in the $u$ and in
the $v$ direction, respectively. (Expand a circle of radius $L$ around some point, in the same
way that we expanded a sphere earlier, $y = -\sqrt{L^2 - x^2} + L \simeq x^2/(2L)$, and we see that it is
locally a parabola with radius of curvature equal to $L$.)


Intrinsic versus extrinsic 


For a 2-by-2 matrix, its determinant and trace, or equivalently, its two eigenvalues, constitute
its two basis-independent attributes. Our insistence that the measure of surface
curvature does not depend on whether we use the $x, y$ or the $u, v$ coordinates means that
we can have two measures of surface curvature:


$$\text{Intrinsic curvature} = \det M = \mu\nu = ab - c^2 \tag{3}$$


$$\text{Extrinsic curvature} = \left(\tfrac12 \operatorname{tr} M\right)^2 = \tfrac14(\mu + \nu)^2 = \tfrac14(a + b)^2 \tag{4}$$


(Note that I have defined the extrinsic curvature to have the same dimension as the intrinsic
curvature. The normalization factor $\tfrac14$ is such that the intrinsic and extrinsic curvatures of
the sphere, for which $a = b$, $c = 0$, are equal.)
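
To see the invariance at work, here is a minimal sketch that rotates the coordinates and recomputes (3) and (4). The helper names `curvatures` and `rotate` are hypothetical, and the rotation is written out by hand for the 2-by-2 case, with $M' = R^T M R$ for a rotation matrix $R$.

```python
import math

def curvatures(a, b, c):
    """Return (intrinsic, extrinsic) curvature, eqs. (3) and (4), for M = [[a, c], [c, b]]."""
    return a * b - c * c, 0.25 * (a + b) ** 2

def rotate(a, b, c, angle):
    """Entries (a', b', c') of R^T M R, with R = [[cos, -sin], [sin, cos]]."""
    co, si = math.cos(angle), math.sin(angle)
    a2 = a * co * co + 2 * c * co * si + b * si * si
    b2 = a * si * si - 2 * c * co * si + b * co * co
    c2 = (b - a) * co * si + c * (co * co - si * si)
    return a2, b2, c2

before = curvatures(0.7, -0.2, 0.5)
after = curvatures(*rotate(0.7, -0.2, 0.5, angle=0.9))   # same two numbers

L = 2.0
sphere = curvatures(1 / L, 1 / L, 0.0)   # both curvatures equal 1/L^2
```

The individual entries $a$, $b$, $c$ change under the rotation, but the determinant and the trace, and hence the two curvatures, do not.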

How do I know which is which, intrinsic versus extrinsic curvature? We appeal to the
cylinder in the prologue as an example, knowing that it is intrinsically flat. Indeed, for
a cylinder† of radius $L$, $z = -\sqrt{L^2 - x^2} + L$ independent of $y$, and hence $a = 1/L$ and $b = c = 0$. We
have intrinsic curvature $= 0$ and, by (4), extrinsic curvature $= \tfrac14 a^2 = 1/(4L^2)$, nonzero as we would expect. In
Einstein gravity, we are normally only interested in the intrinsic curvature of spacetime,
since we can't get out* of spacetime and look at its extrinsic curvature.


* Note that while $M$ is diagonal, the metric is not.

† The cylinder example also underlines the fact that we are always talking about local curvature, as explained
in the preceding chapter. Globally, the mites could of course go all the way round and come back to the same
place.


Negative curvature 


A sphere of radius $L$ has intrinsic and extrinsic curvatures both equal to $1/L^2$. Since every
point on the sphere is the same as the south pole, the sphere has constant intrinsic and
extrinsic curvatures.

A beginning student named Confusio¹ looked, well, confused. He said: “You mean the
curvature at the bottom of a bowl and at the top of a dome are not opposite in sign?
Everybody knows that the bowl is curved upward, the dome downward.”

Confusio's everyday intuition caused him to think that he was somehow inside the
bowl and outside the dome. In fact, the sphere is an infinitely thin surface. There is no
conceptual distinction between being on the “inside” and the “outside” of the surface. Of
course, we could also turn the bowl upside down. Let's calculate the curvature at “the top
of the dome,” as Confusio calls it, otherwise known as the north pole. Near the north pole,
$z = \sqrt{L^2 - x^2 - y^2} - L \simeq -\frac{1}{2L}(x^2 + y^2)$, so that $a = b = -\frac{1}{L}$, $c = 0$. Thus, the intrinsic
curvature is $ab - c^2 = \frac{1}{L^2}$, as expected. Compared to the neighborhood around the south
pole, $z$ flips sign, but $dz^2$ does not, so that the metric, and hence the curvature, have the
same value at the north pole and at the south pole, as they should.

Thus, in contrast to everyday perception, a negative curvature surface² is one in which
the two parabolas bend in opposite directions, with $\mu$ and $\nu$ of opposite sign (for instance, $\mu = -\nu$), such as the proverbial saddle. A
more contemporary example is the kind of potato chips that come in a cylindrical container.
I am surmising that the typical reader of this text is more likely to eat potato chips than
to gallop across the steppes with the Golden Horde. An example is a surface that goes
like $z = xy$ for $x \approx 0$, $y \approx 0$. Then $a = b = 0$, $c = 1$, and $\mu = 1$, $\nu = -1$, and it has negative
curvature $R = -1$ at the origin.
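
The bowl, the dome, and the saddle can be compared with a few lines of code. This is a minimal sketch, assuming the surface is given as a height function $z(x, y)$ whose tangent plane at the origin is horizontal; the coefficients $a$, $b$, $c$ are estimated by central differences, and the names `hessian` and `intrinsic_curvature` are illustrative.

```python
import math

def hessian(z, h=1e-3):
    """Central-difference estimate of (a, b, c) = (z_xx, z_yy, z_xy) at the origin."""
    a = (z(h, 0) - 2 * z(0, 0) + z(-h, 0)) / h**2
    b = (z(0, h) - 2 * z(0, 0) + z(0, -h)) / h**2
    c = (z(h, h) - z(h, -h) - z(-h, h) + z(-h, -h)) / (4 * h**2)
    return a, b, c

def intrinsic_curvature(z):
    """det M = ab - c^2, eq. (3); valid only where the tangent plane is z = 0."""
    a, b, c = hessian(z)
    return a * b - c * c

L = 2.0
bowl = lambda x, y: -math.sqrt(L * L - x * x - y * y) + L    # south pole
dome = lambda x, y: math.sqrt(L * L - x * x - y * y) - L     # north pole
saddle = lambda x, y: x * y

k_bowl = intrinsic_curvature(bowl)      # close to +1/L^2
k_dome = intrinsic_curvature(dome)      # also close to +1/L^2
k_saddle = intrinsic_curvature(saddle)  # close to -1
```

Flipping the sign of $z$ flips the signs of $a$, $b$, and $c$ together, which leaves $ab - c^2$ unchanged; only the saddle, with its two parabolas bending oppositely, comes out negative.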

Professor Flat: “Confusio, you still look confused. Think of a donut or a bagel, otherwise 
known as the torus. Do you see that along its hole, the curvature is negative? Along the 
outer edge, in contrast, the curvature is positive.” 

Incidentally, it then follows that somewhere on the torus the curvature (we are always 
talking about intrinsic curvature) vanishes. Do you see where? You could always use the 
tangent space method here to calculate the curvature and thus check your intuition. 


Embedding of curved spaces in higher dimensional flat spaces 
In general, we could embed³ a $D$-dimensional space in $N$-dimensional Euclidean space
$E^N$. Write $X^A(x^1, \cdots, x^D)$ for $A = 1, \cdots, N$. For everyday surfaces, $D = 2$ and $N = 3$, but
the formalism to be given presently works for arbitrary values of $N > D$. (For the iconic
sphere, $(x^1, x^2) = (\theta, \varphi)$ and $(X^1, X^2, X^3) = (X, Y, Z) = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$.)
We see that $x$ represents coordinates on the surface: as $x$ varies, the points $X^A(x)$ sweep
out a $D$-dimensional space in $E^N$. In this age of spectacular computer graphics, you can
easily generate all kinds of interesting surfaces by playing around with three functions
$X^A(x)$ of two variables.


* Except in some speculative and unproven theories!

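Consider two neighboring points: $P$ described by $x^\mu$ and $Q$ described by $x^\mu + dx^\mu$. If
the Euclidean coordinates of $P$ are given by $X^A$, then the Euclidean coordinates of $Q$ are
given by $X^A + dX^A = X^A + \frac{\partial X^A}{\partial x^\mu} dx^\mu$. Pythagoras gives us the distance squared between
$P$ and $Q$


$$ds^2 = \sum_A (dX^A)^2 = \sum_A \frac{\partial X^A}{\partial x^\mu} \frac{\partial X^A}{\partial x^\nu}\, dx^\mu dx^\nu = g_{\mu\nu}\, dx^\mu dx^\nu \tag{5}$$


with the metric


$$g_{\mu\nu} = \sum_A \frac{\partial X^A}{\partial x^\mu} \frac{\partial X^A}{\partial x^\nu} \tag{6}$$


Here I choose to display the summation over $A$ explicitly. The attentive reader will realize
that (1) and (2) are examples of these two equations. The metric $g_{\mu\nu}$ is said to be induced
by the ambient Euclidean metric.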
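
Equation (6) translates almost word for word into code. The following is a minimal sketch, assuming the embedding functions $X^A(x)$ are supplied as an ordinary Python function and the partial derivatives are approximated by central differences; `induced_metric` and `unit_sphere` are illustrative names.

```python
import math

def induced_metric(X, x, h=1e-6):
    """g_{mu nu} = sum_A (dX^A/dx^mu)(dX^A/dx^nu), eq. (6), at the point x."""
    D = len(x)
    columns = []                       # columns[mu][A] = dX^A / dx^mu
    for mu in range(D):
        plus, minus = list(x), list(x)
        plus[mu] += h
        minus[mu] -= h
        Xp, Xm = X(*plus), X(*minus)
        columns.append([(p - m) / (2 * h) for p, m in zip(Xp, Xm)])
    return [[sum(u * v for u, v in zip(columns[mu], columns[nu]))
             for nu in range(D)] for mu in range(D)]

def unit_sphere(theta, phi):
    """Embedding (X, Y, Z) of the unit sphere in E^3."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

theta, phi = 1.0, 0.3
g = induced_metric(unit_sphere, (theta, phi))   # approximately diag(1, sin^2 theta)
```

Any other embedding, with any $N > D$, can be passed in the same way; only the function `X` changes.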

Another common embedding method is to restrict $X^A$ to satisfy certain conditions.
Again, we have the familiar example of the unit sphere defined by $X^2 + Y^2 + Z^2 = 1$.
Indeed, writing $(X, Y, Z)$ in terms of $(\theta, \varphi)$ amounts to solving this equation. Instead,
we could choose to eliminate* $Z$. In the notation used above, $(x^1, x^2) = (X, Y)$. Then,
$ds^2 = dX^2 + dY^2 + dZ^2 = dX^2 + dY^2 + \frac{(X\,dX + Y\,dY)^2}{1 - X^2 - Y^2}$. We can clearly exploit the rotational
invariance in the $(X$-$Y)$ plane and write⁴ $X = r\cos\varphi$, $Y = r\sin\varphi$, $X^2 + Y^2 = r^2$, and
$X\,dX + Y\,dY = r\,dr$, thus obtaining


$$ds^2 = dr^2 + r^2 d\varphi^2 + \frac{r^2 dr^2}{1 - r^2} = \frac{dr^2}{1 - r^2} + r^2 d\varphi^2 \tag{7}$$


Since the metric in (7) and the usual $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ both describe the sphere, they
must be related by a coordinate transformation. Do you see how? Another question for you:
Why does (7) become singular as $r \to 1$? See appendix 1. Further examples of embedding
are given in appendix 2.


* The two solutions of the quadratic equation for $Z$ define two coordinate patches covering the northern and
southern hemispheres (minus the equator, strictly speaking), as mentioned in the preceding chapter.


Locally flat


The fearful student looks much more relaxed. He asks, “All this is clear enough for actual
2-dimensional surfaces I can visualize, but is it obvious that we can always choose our
neighborhood to be locally flat for any space of any dimension $D$?”

Professor Flat: “It is fairly obvious for any sufficiently smooth⁵ space. Look at how the
metric transforms (I.5.14) when you go to a new set of coordinates:


$$g'_{\lambda\sigma}(x') = g_{\mu\nu}(x)\, \frac{\partial x^\mu}{\partial x'^\lambda} \frac{\partial x^\nu}{\partial x'^\sigma} \tag{8}$$


Within reason, you could choose any x’ you want, and for each choice, you get a new form 
for the metric. You have a lot of freedom to massage the metric into the form you want. 
The proof simply amounts to counting how much freedom you have on hand.” 

So, look at our space around a point $P$. First, for writing convenience, shift our coordinates
so that the point $P$ is labeled by $x = 0$. Expand the given metric around $P$ out
to second order: $g_{\mu\nu}(x) = g_{\mu\nu}(0) + A_{\mu\nu,\lambda} x^\lambda + B_{\mu\nu,\lambda\sigma} x^\lambda x^\sigma + \cdots$. (The commas in the
subscripts carried by $A$ and $B$ are purely for notational clarity, to separate two sets of
indices.)

Again, let me assure the abecedarians that nothing profound is going on. We are merely
expanding $g_{\mu\nu}(x)$ in a power series, with the coefficients given names $A_{\mu\nu,\lambda}$ and $B_{\mu\nu,\lambda\sigma}$
(with indices that accord with the rule of repeated summation having to involve an upper
and a lower index). Note that the lower indices on the left hand side remain lower indices
on the right hand side, and similarly for upper indices.

As always, if you get confused, you should simply refer to the sphere. Thus, let the
coordinates of the point $P$ be $(\theta_*, \varphi_*)$, so that $x^1 = (\theta - \theta_*)$, $x^2 = (\varphi - \varphi_*)$. (Of course, in
this simple case, nothing depends on $\varphi_*$.) What we just wrote down is then simply, for
example, $g_{\varphi\varphi} = \sin^2\theta = \sin^2\theta_* + 2\sin\theta_*\cos\theta_*\, x^1 + \cdots$, so that $A_{\varphi\varphi,1} = 2\sin\theta_*\cos\theta_*$
and $A_{\varphi\varphi,2} = 0$. Nothing profound at all.

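Change coordinates according to $x^\mu = K^\mu{}_\nu x'^\nu + L^\mu{}_{\nu\lambda} x'^\nu x'^\lambda + M^\mu{}_{\nu\lambda\sigma} x'^\nu x'^\lambda x'^\sigma + \cdots$.
Again, nothing profound: $K, L, M, \cdots$ are just a bunch of coefficients to be determined.
At the point $P$, the new metric is given by (8):


$$g'_{\lambda\sigma}(0) = g_{\mu\nu}(0)\, K^\mu{}_\lambda K^\nu{}_\sigma \tag{9}$$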


Regard this as a matrix equation $g' = K^T g K$, where $T$ denotes transpose. Since $g_{\mu\nu}(0)$ is
symmetric and real, there always exists a matrix $K$ that will diagonalize it. After $g_{\mu\nu}(0)$
becomes diagonal (with positive diagonal elements; we will take that as a definition of
space), we could scale each coordinate, one by one, by an appropriate factor,* so that the
diagonal elements become 1. We end up with the Euclidean metric $g_{\mu\nu}(0) = \delta_{\mu\nu}$, with $\delta_{\mu\nu}$
equal to 1 if $\mu = \nu$ and 0 otherwise. (We shall keep dropping primes as we move along,
renaming the new coordinates and metric $x^\mu$ and $g_{\mu\nu}$, respectively.)
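
For $D = 2$ the two steps, rotate to diagonalize and then rescale, can be carried out explicitly. Here is a minimal sketch under that assumption; `flatten_at_point` and `congruence` are hypothetical helper names, and the sample matrix stands in for some $g_{\mu\nu}(0)$.

```python
import math

def flatten_at_point(g):
    """Return a 2x2 matrix K with K^T g K = identity, for symmetric positive-definite g."""
    g11, g12, g22 = g[0][0], g[0][1], g[1][1]
    # Rotation angle that removes the off-diagonal element of R^T g R.
    angle = 0.5 * math.atan2(2 * g12, g11 - g22)
    co, si = math.cos(angle), math.sin(angle)
    # Diagonal elements after the rotation (the eigenvalues of g).
    lam1 = g11 * co * co + 2 * g12 * co * si + g22 * si * si
    lam2 = g11 * si * si - 2 * g12 * co * si + g22 * co * co
    # K = R diag(1/sqrt(lam1), 1/sqrt(lam2)): rotate, then scale each coordinate.
    return [[co / math.sqrt(lam1), -si / math.sqrt(lam2)],
            [si / math.sqrt(lam1), co / math.sqrt(lam2)]]

def congruence(K, g):
    """Compute K^T g K, the right hand side of eq. (9) as a matrix product."""
    return [[sum(K[m][i] * g[m][n] * K[n][j] for m in range(2) for n in range(2))
             for j in range(2)] for i in range(2)]

g0 = [[2.0, 0.6], [0.6, 1.0]]
K = flatten_at_point(g0)
g_new = congruence(K, g0)   # approximately [[1, 0], [0, 1]]
```

Following `K` with any further rotation gives another matrix that does the job equally well, which is the leftover freedom counted in the next section.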


Let us count 


The object $K^\mu{}_\nu$ has $D^2$ elements to start with. How many are left, now that we have fixed
$g_{\mu\nu}(0)$?


* $x^\mu \to x^\mu/\sqrt{g_{\mu\mu}(0)}$, no sum over repeated indices here.

Let's count. We are in $D$-dimensional space. First, note that the number of independent
elements in an antisymmetric $D$-by-$D$ matrix $F_{\mu\nu} = -F_{\nu\mu}$ is equal to $\tfrac12 D(D - 1)$, since,
for each of the $D$ values the first index can take on, the second index can take on only $D - 1$
values.* In contrast, a symmetric $D$-by-$D$ matrix has $\tfrac12 D(D - 1)$ off-diagonal elements and
$D$ diagonal elements, making for a total of $\tfrac12 D(D + 1)$ elements.†

Since $g_{\mu\nu}(0)$ had $\tfrac12 D(D + 1)$ arbitrary elements to begin with, we had to use this many
elements in $K^\mu{}_\nu$ to adjust these to $\delta_{\mu\nu}$. Hence, the object $K^\mu{}_\nu$ has $D^2 - \tfrac12 D(D + 1) =
\tfrac12 D(D - 1)$ elements left over. This is exactly the number of independent elements in an
antisymmetric $D$-by-$D$ matrix. Hardly an accident! As discussed in detail in chapters I.3
and I.4, the number of generators in the rotation group $SO(D)$ relevant for $D$-dimensional
space is $\tfrac12 D(D - 1)$. (For instance, for $D = 3$, $\tfrac12 D(D - 1) = 3$, and we have precisely three
rotations that leave the identity matrix $\delta_{\mu\nu}$ invariant.) We have just shown the fairly obvious
fact that in $D$-dimensional space, the freedom we have left in $K$ is precisely the freedom
to rotate.

Now onward. We proceed to the next step and claim that the linear terms in $g_{\mu\nu}(x) =
\delta_{\mu\nu} + A_{\mu\nu,\lambda} x^\lambda + \cdots$ can be removed by suitable choices of $L^\mu{}_{\nu\lambda}$ in $x^\mu = x'^\mu + L^\mu{}_{\nu\lambda} x'^\nu x'^\lambda +
\cdots$. (Evidently, $A_{\mu\nu,\lambda}$ and $L^\mu{}_{\nu\lambda}$ have been modified already by what we have done thus far,
but we do not want to introduce more letters.)

I urge the reader to expand (8) to first order in $x$ to see what is going on. Since $A_{\mu\nu,\lambda}$
is symmetric in $\mu\nu$, it contains $\tfrac12 D(D + 1) \cdot D = \tfrac12 D^2(D + 1)$ elements (18 for $D = 3$). But
$L^\mu{}_{\nu\lambda}$ also has $\tfrac12 D^2(D + 1)$ elements: like $A_{\mu\nu,\lambda}$ it has three indices and is symmetric in two
of them. Yes! We have enough $L$s to knock off⁶ the $A$s. Thus, locally around any point $P$ in
a Riemannian manifold, the metric can always be chosen to be $g_{\mu\nu}(x) = \delta_{\mu\nu}$ plus second
order terms.
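
If you distrust the formulas, you can let the computer enumerate the index combinations directly. This is a minimal sketch of the counting at zeroth and first order, assuming only the symmetry properties stated above; the function names are illustrative.

```python
from itertools import combinations, combinations_with_replacement

def n_symmetric(D):
    """Independent elements of a symmetric D-by-D matrix, by listing pairs mu <= nu."""
    return sum(1 for _ in combinations_with_replacement(range(D), 2))

def n_antisymmetric(D):
    """Independent elements of an antisymmetric D-by-D matrix, by listing pairs mu < nu."""
    return sum(1 for _ in combinations(range(D), 2))

def zeroth_order_leftover(D):
    """Elements of K not used up in setting g(0) to the identity."""
    return D * D - n_symmetric(D)

def first_order_counts(D):
    """(number of A_{mu nu, lambda}, number of L^mu_{nu lambda})."""
    n_A = n_symmetric(D) * D    # symmetric pair (mu nu), free index lambda
    n_L = D * n_symmetric(D)    # free index mu, symmetric pair (nu lambda)
    return n_A, n_L

leftovers = [zeroth_order_leftover(D) for D in range(1, 6)]   # [0, 1, 3, 6, 10]
counts_D3 = first_order_counts(3)                             # (18, 18)
```

The leftover at zeroth order matches the number of antisymmetric elements, that is, the number of rotations, while at first order the two counts agree exactly, so nothing is left over. As the text warns, equal counts make the cancellation plausible but do not by themselves prove it.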

Thus, at any point P in a Riemannian manifold, not only can we choose the metric to be 
Euclidean, but we can also arrange for the first order deviations from Euclidean to vanish. 
Indeed, in our simple example (2), the deviation from locally Euclidean is quadratic. (See
also exercise I.5.2.) That the corrections to the locally flat Euclidean metric are second
order, rather than first order, is the mathematical explanation for why humans thought 
their world was flat for so long. 


Curvature 


Let's keep going! At this stage, we have $g_{\mu\nu}(x) = \delta_{\mu\nu} + B_{\mu\nu,\lambda\sigma} x^\lambda x^\sigma + \cdots$ and $x^\mu = x'^\mu +
M^\mu{}_{\nu\lambda\sigma} x'^\nu x'^\lambda x'^\sigma + \cdots$. How many components of $B$ can we knock off by judicious choices
of $M^\mu{}_{\nu\lambda\sigma}$?


* Recall that we did this sort of counting in chapter I.4.

† Alternatively, knowing that $n(D)$, the number of independent elements in a symmetric $D$-by-$D$ matrix, can be
at most quadratic in $D$, we write $n(D) = c_0 + c_1 D + c_2 D^2$ and fix the coefficients instantly from $n(0) = 0$, $n(1) = 1$,
and $n(2) = 3$. We obtain $n(D) = \tfrac12 D(D + 1)$. A similar argument gives the number of independent elements in
an antisymmetric $D$-by-$D$ matrix.

Let's count. The number of components in $B_{\mu\nu,\lambda\sigma}$ is easy to count, since $B$ has two
pairs of symmetric indices, and so we have $\left(\tfrac12 D(D + 1)\right)^2$ components. The number of
components in $M^\mu{}_{\nu\lambda\sigma}$ is harder to count. Focus on the symmetric triplets $\nu\lambda\sigma$. We know
that the number $f(D)$ of possible choices is cubic in $D$. Again, the quickest method,
efficient though not particularly clever, is to write $f(D)$ as a cubic polynomial in $D$ and
then determine the coefficients by “fitting to data.” Start with $f(1) = 1$. To find $f(2)$,
simply exploit the symmetry to arrange the indices in the order $\nu \ge \lambda \ge \sigma$, and list the
possibilities: 222, 221, 211, 111, and so $f(2) = 4$. Similarly $f(3) = 10$. (Also, the process is
clearly inductive: $f(D) = f(D - 1) + \tfrac12 D(D + 1)$.) In this way, we obtain $f(D) = \tfrac16 D(D +
1)(D + 2)$, and hence $M^\mu{}_{\nu\lambda\sigma}$ has $\tfrac16 D^2(D + 1)(D + 2)$ components. We use these to knock
off some components in $B$, leaving $\tfrac14 D^2(D + 1)^2 - \tfrac16 D^2(D + 1)(D + 2) = \tfrac1{12} D^2(D^2 - 1)$
elements that we can't get rid of.

If you didn't quite get all that, just write it out for $D = 2$, and you will see what's
going on. At this stage of the cancellation game, we have 3 coefficients $a, b, c$ in $g_{11} =
1 + a(x^1)^2 + b(x^2)^2 + c(x^1 x^2) + \cdots$. Similarly for $g_{22}$ and $g_{12}$, for a total of $3 \times 3 = 9$
coefficients we want to cancel. On the other side of the ledger, we can adjust 4 parameters
$p, q, r, s$ in $x^1 = x'^1 + p(x'^1)^3 + q(x'^1)^2 x'^2 + r x'^1 (x'^2)^2 + s(x'^2)^3 + \cdots$. Similarly for $x^2$, for
a total of $2 \times 4 = 8$ parameters we can adjust to knock off the 9 coefficients in the metric.
So we are left with $9 - 8 = 1 = \tfrac1{12} 2^2(2^2 - 1)$ number we can't get rid of.

As another example, for $D = 4$, $B$ has 100 components, and $M$ 80 components, leaving
us with $100 - 80 = 20 = \tfrac1{12} 4^2(4^2 - 1)$ numbers we can't get rid of.

The measure of curvature is what we can't iron flat. We thus conclude that at any given
point on a Riemannian manifold, we need Riemann($D$) $= \tfrac1{12} D^2(D^2 - 1)$ numbers to
specify the curvature. In particular, Riemann(1) = 0 and Riemann(2) = 1, confirming*
what we already know. The number Riemann($D$) increases rapidly: Riemann(3) = 6, and
for $D = 4$, which, I'm sure you've heard, is relevant for Einstein gravity, the curvature has
Riemann(4) = 20 components, a number that sets our student FS to fear and trembling
again. It is of course reasonable that it takes a lot of numbers to describe curvature in
higher dimensional space, since we have to specify how the space is curving in many
different directions.
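
The second order counting can also be done by brute force, which is a useful check on the polynomial fitting. The sketch below assumes only the index symmetries described in the text and enumerates the symmetric index sets with the standard library; `n_sym` and `riemann_count` are illustrative names.

```python
from itertools import combinations_with_replacement

def n_sym(D, rank):
    """Independent components of a totally symmetric object with `rank` indices in D dimensions."""
    return sum(1 for _ in combinations_with_replacement(range(D), rank))

def riemann_count(D):
    """Components of B that cannot be removed by M."""
    n_B = n_sym(D, 2) ** 2    # B_{mu nu, lambda sigma}: two symmetric pairs
    n_M = D * n_sym(D, 3)     # M^mu_{nu lambda sigma}: free mu, symmetric triplet
    return n_B - n_M

f_values = [n_sym(D, 3) for D in (1, 2, 3)]             # f(1), f(2), f(3)
riemann = [riemann_count(D) for D in (1, 2, 3, 4)]      # compare with D^2 (D^2 - 1) / 12
```

Running it for a few more values of $D$ is a quick way to convince yourself that the quartic growth is not an artifact of fitting a polynomial to three data points.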

To make sure that you follow this discussion, I suggest you try this fun exercise. Suppose
you were given a space described by the metric $ds^2 = dr^2 + r^2 d\theta^2$. This is of course a
plane as flat as Kansas, but suppose you didn't know that. Calculate the curvature by first
transforming polar coordinates into locally flat coordinates† at the point $(r, \theta) = (r_*, \theta_*)$ by
going through all the steps here. Then extract the combination of the $B_{\mu\nu,\lambda\sigma}$'s giving the
intrinsic curvature. By the end of this straightforward exercise, you will probably agree that
there ought to be a better way to get at the curvature.


* Note that a curve has no intrinsic curvature, only extrinsic curvature, as is intuitively clear, while a surface,
as described in chapter I.5, is characterized by two numbers specifying the intrinsic and extrinsic curvatures. We
are evidently talking about the intrinsic curvature here.

† Also known as Riemann normal coordinates. Thanks, Jargon Guy.

It is commonly said in academia that the best way to master a subject is to teach it. In
this connection, the computer is a more-than-willing student. Write a program such that,
given a metric, the program is able to find the locally flat coordinates at an arbitrary point.
If you do this, as I did while writing this chapter, you will truly understand the counting
above.
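
A full program of this kind has to solve for $K$, $L$, and $M$ order by order. As a much more modest starting point, here is a sketch that merely checks a proposed set of new coordinates by applying the transformation law (8) numerically. It assumes the polar metric of the exercise above and tests shifted Cartesian coordinates $(u, v)$ centered on $P$; the function names and the default values of `r_star` and `theta_star` are illustrative choices.

```python
import math

def polar_metric(r, theta):
    """g_{mu nu} for ds^2 = dr^2 + r^2 dtheta^2."""
    return [[1.0, 0.0], [0.0, r * r]]

def old_coords(u, v, r_star=3.0, theta_star=0.5):
    """Polar (r, theta) as functions of Cartesian offsets (u, v) from the point P."""
    x = r_star * math.cos(theta_star) + u
    y = r_star * math.sin(theta_star) + v
    return (math.hypot(x, y), math.atan2(y, x))

def transformed_metric(u, v, h=1e-6):
    """g'(x') from eq. (8), with the Jacobian dx^mu/dx'^lambda by central differences."""
    point = (u, v)
    J = [[0.0, 0.0], [0.0, 0.0]]       # J[mu][lam] = d x^mu / d x'^lam
    for lam in range(2):
        plus, minus = list(point), list(point)
        plus[lam] += h
        minus[lam] -= h
        xp, xm = old_coords(*plus), old_coords(*minus)
        for mu in range(2):
            J[mu][lam] = (xp[mu] - xm[mu]) / (2 * h)
    g = polar_metric(*old_coords(u, v))
    return [[sum(g[mu][nu] * J[mu][lam] * J[nu][sig]
                 for mu in range(2) for nu in range(2))
             for sig in range(2)] for lam in range(2)]

g_at_P = transformed_metric(0.0, 0.0)        # approximately the identity
g_nearby = transformed_metric(0.2, -0.1)     # still the identity: the plane is flat
```

Because this space is exactly flat, the transformed metric is the identity everywhere and not just to first order at $P$. For a genuinely curved metric, the same check would show deviations that start at second order in $(u, v)$.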


Guessing what the Riemann curvature must look like 


One significant by-product of this counting and subsequent understanding of local flatness
is that we see what the general expression for curvature must involve. Since the curvature at
the point $P$ is described by those components in $B_{\mu\nu,\lambda\sigma}$ that we cannot transform away, we
conclude* that the definition of curvature must involve two powers† of derivatives acting
on the metric $g_{\mu\nu}$. As we have already seen, it takes lots of numbers to describe curvature
completely. But we also know that we could change coordinates (from $(x, y)$ to $(u, v)$, for
the simple curved surface that started this chapter). Thus, we don't want these numbers to
gallop wildly out of our control when we change coordinates. Now you see that the concept
of a tensor, as discussed in chapter I.4, is going to play a big role. In fact, the $\tfrac1{12} D^2(D^2 - 1)$
numbers are the components of a tensor, quite naturally called the Riemann curvature
tensor. With our intuitive discussion here, we can anticipate that the curvature tensor will
have the schematic form $R_{\cdot\cdot\cdot\cdot} \sim \partial\partial g_{\cdot\cdot}$; since the number of components grows quartically
with $D$, we can even guess that it will carry 4 indices. Very nice: this is all consistent with
the curvature being related to $B_{\mu\nu,\lambda\sigma}$.


What did Riemann want? 


The great insight of Carl Friedrich Gauss and other pioneers of differential geometry, 
not to mention the mite professors of geometry, is that given the metric, we should be 
able to determine the intrinsic curvature without worrying how the surface is embedded, 
as was already explained in the preceding chapter. Knowing the distance between two 
infinitesimally separated points, we can find the distance between two points far apart by 
integrating ds along any path connecting the two points. We can then define a “straight 
line” between two points as that path that minimizes the distance between them. This 
allows us to do geometry as Euclid had shown us. Shades of Newton, Leibniz, and Lie! 

Indeed, we defined the Poincaré half plane by specifying $ds^2$, without having to say, or
even to care about, how it is embedded.

To calculate the intrinsic curvature, we need to know only the metric $g_{\mu\nu}$; we don't need
to know the embedding functions $X^A(x^\mu)$. When the great Gauss discovered this fact for
curved surfaces in 1828, he was so struck by it that he called it the Theorema Egregium
(the outstanding or extraordinary theorem; the meaning of the Latin word has been much
distorted in the English word “egregious”). Here's hoping that you find your Theorema
Egregium some day.


* Since $B_{\mu\nu,\lambda\sigma}$ are nothing but the second order Taylor coefficients in an expansion of $g_{\mu\nu}$ around the point $P$.
† This is also confirmed by the intuitive example in (1) and (2).

The whole point of the story in the prologue is that, like the mites, we cannot get out of 
the universe, yet we can measure its curvature. 

Bernhard Riemann, who was two years old when the Theorema Egregium was born and
who as a student attended Gauss's lectures, took the profound step of extending differential
geometry to arbitrary dimensions and definitively taught us how to calculate curvature.
Given a metric $g_{\mu\nu}$, Riemann wanted the curvature.

Now that I have sketched it out for you, you could set yourself a challenge and see 
how you stack up compared to Riemann. The problem is easily stated and well posed. 
Construct a tensor $R_{\cdot\cdot\cdot\cdot} \sim \partial\partial g_{\cdot\cdot}$ out of two partial derivatives and the metric that would
measure intrinsic curvature. See how far you can get before reading further. 

You will soon see that it is not so simple. Indeed, even for D = 2, looking at (3), and 
knowing the answer, it is not obvious what the combination should be. In that simplest 
of all cases, the Riemann curvature tensor has only one component and thus degenerates 
into a scalar. So given 3 functions $g_{11}$, $g_{22}$, and $g_{12}$, call them $E$, $F$, $G$ say, each a function
of two variables x, y, find an expression involving E, F, G, allowing yourself only two 
derivatives, such that the expression does not change under the transformation in (8). You 
recognize that this amounts to Gauss’s problem. Challenge yourself! 

A historical curiosity. After Riemann worked out the general treatment of curved spaces, 
he had, remarkably enough, some vague thoughts about curved spaces having something 
to do with gravity. Unfortunately for him, he was way too early. Special relativity and 
Minkowski's unification of space and time into spacetime (as we will see in part III) were
still in the future. We now know that it is curved spacetime, not curved space, that has 
something to do with gravity (as we will see in part IV). 


Appendix 1: Coordinate singularity, a simple version of the Einstein-Rosen 
bridge, and a wormhole 


The metric discussed in the text, $ds^2 = \frac{dr^2}{1 - r^2} + r^2 d\varphi^2$, illustrates an important point. Here are the answers to the
questions I asked you. Set $r = \sin\theta$, so that $dr^2 = \cos^2\theta\, d\theta^2$ and $ds^2$ becomes $d\theta^2 + \sin^2\theta\, d\varphi^2$. The singularity at
$r = 1$ is merely due to our choice of coordinates going bad at the equator. In fact, the sphere is perfectly smooth
there.
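
A few numbers make the point vivid. The following minimal sketch, with illustrative function names, evaluates the radial component of (7) near the equator and the same component after the substitution $r = \sin\theta$.

```python
import math

def g_rr(r):
    """Radial metric component of (7) on the unit sphere."""
    return 1.0 / (1.0 - r * r)

def g_theta_theta(theta):
    """The same component after r = sin(theta): g_rr times (dr/dtheta)^2."""
    return g_rr(math.sin(theta)) * math.cos(theta) ** 2

thetas = [0.5, 1.0, 1.5, 1.57]                       # approaching the equator at pi/2
radial = [g_rr(math.sin(t)) for t in thetas]         # grows without bound
angular = [g_theta_theta(t) for t in thetas]         # stays equal to 1
```

The blow-up in `radial` is exactly compensated by the vanishing of $dr/d\theta$, so in the $\theta$ coordinate nothing remarkable happens at the equator.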

When we study black holes in parts VI and VII, we will encounter this kind of singularity, known as a coordinate
singularity, in contrast to an actual or physical singularity, when the geometry itself becomes singular. As another
example, consider the surface described by


$$ds^2 = \frac{dr^2}{1 - \frac{r_S}{r}} + r^2 d\varphi^2 \tag{10}$$


with $r_S$ a positive constant. Again, the singularity at $r = r_S$ is merely a coordinate singularity. Indeed, this
surface could be embedded into $E^3$ using the familiar cylindrical coordinates $ds^2 = dr^2 + r^2 d\varphi^2 + dz^2$ and setting
$z^2 = 4r_S(r - r_S)$. (Let's verify this: $z\,dz = 2r_S\,dr$, so that $dr^2 + dz^2 = \left(1 + \frac{r_S}{r - r_S}\right)dr^2 = \frac{r}{r - r_S}\,dr^2$.) Thus, the surface is
perfectly smooth at $r = r_S$ (see figure 2). It consists of two planes connected by a “throat,” known as the Einstein-Rosen
bridge. Note that a mite geometer could perfectly well travel from the upper to the lower plane without
noticing any singularity at all. So this type of geometry is sometimes known picturesquely as a wormhole.

Figure 2 The Einstein-Rosen bridge, underlining the difference between a coordinate singularity
and a physical singularity.

A line of fixed $r$ is simply a circle. For $r \gg r_S$, the space becomes flat: $ds^2 \to dr^2 + r^2 d\varphi^2$. Incidentally, in the
same cylindrical set-up, the $S^2$ metric we started this appendix with is given by the embedding $z^2 = 1 - r^2$. In
fact, I am repeating myself.

Meanwhile, Confusio has flown off to the equator to investigate the $r = 1$ singularity there. “What singularity?”
the locals ask him.
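
The verification of the embedding can also be done numerically. This is a minimal sketch, assuming the upper sheet $z = +\sqrt{4r_S(r - r_S)}$ and a central-difference derivative; the function names and sample radii are illustrative.

```python
import math

def embedded_g_rr(r, r_s, h=1e-6):
    """1 + (dz/dr)^2 for the upper sheet z = sqrt(4 r_s (r - r_s)) in cylindrical coordinates."""
    z = lambda rr: math.sqrt(4 * r_s * (rr - r_s))
    dz_dr = (z(r + h) - z(r - h)) / (2 * h)
    return 1 + dz_dr * dz_dr

def metric_g_rr(r, r_s):
    """Radial component of the metric (10)."""
    return 1 / (1 - r_s / r)

r_s = 1.0
radii = [1.5, 2.0, 10.0]
pairs = [(embedded_g_rr(r, r_s), metric_g_rr(r, r_s)) for r in radii]   # each pair agrees
```

Both expressions grow as $r \to r_S$, where the surface turns vertical at the throat, yet the surface itself, viewed in $E^3$, is a perfectly smooth paraboloid of revolution there.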


Appendix 2: Spheres 


In the text, we studied the 2-dimensional sphere $S^2$. Clearly, it is not difficult to generalize our discussion to
higher dimensional spheres. For instance, $S^3$ is embedded in $E^4$ by $X^2 + Y^2 + Z^2 + W^2 = 1$. Replace $(X, Y, Z)$
by the usual spherical coordinates, so that $W^2 = 1 - r^2$ and $W\,dW = -r\,dr$, and eliminate $W$ in $ds^2 = dX^2 +
dY^2 + dZ^2 + dW^2 = dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2) + \frac{r^2 dr^2}{1 - r^2}$. We thus obtain the metric for $S^3$:


$$ds^2 = \frac{dr^2}{1 - r^2} + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2) = \frac{dr^2}{1 - r^2} + r^2 d\Omega_2^2 \tag{11}$$


The only difference from (7) is that here the angular element $d\Omega_2^2$ on $S^2$ appears rather than the angular element
$d\Omega_1^2 = d\varphi^2$ on $S^1$ (namely the circle). Indeed, recalling what we just learned in the preceding appendix, we invite
ourselves to write $r = \sin\psi$ in (11), thus obtaining $d\Omega_3^2 = d\psi^2 + \sin^2\psi\, d\Omega_2^2$, where, in line with the usual solid
angle notation, we have renamed the line element $ds^2$ on $S^3$ as $d\Omega_3^2$.

Evidently, we can determine the line element on $S^d$ iteratively,


$$d\Omega_d^2 = d\psi^2 + \sin^2\psi\, d\Omega_{d-1}^2 \tag{12}$$


thus recovering the result of exercise I.5.10. As noted in that exercise, this generalizes the elementary school
observation that the curves of constant latitude on the globe form circles. Here, the subspaces of constant $\psi$
form $S^{d-1}$.

I already mentioned in exercise I.5.9 that we may actually live in $S^3$. In (11), simply scale $r \to r/L$, and
then multiply $ds^2$ by $L^2$ to obtain $ds^2 = \frac{dr^2}{1 - r^2/L^2} + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2)$. All we have done is to restore the length
dimension to $r$. One objective of observational cosmology is to determine $L$, or failing that, to set a lower bound
on it.


Appendix 3: Hyperbolic spaces 


The 2-dimensional hyperbolic space $H^2$ (or pseudosphere, as some people call it) is less intuitively accessible than
$S^2$ and hence generally not mentioned in elementary schools. Define a 2-dimensional surface by $X^2 + Y^2 - W^2 =
-1$, embedded (crucially) not in $E^3$ but in a “pseudo-Euclidean” space with the metric


$$ds^2 = dX^2 + dY^2 - dW^2 \tag{13}$$


Once again, replace $(X, Y)$ by the usual polar coordinates, so that $W^2 = 1 + r^2$ and $W\,dW = r\,dr$. Writing
$dW^2 = \frac{r^2 dr^2}{1 + r^2}$ in $ds^2$, we obtain the metric for $H^2$:


$$ds^2 = dr^2 + r^2 d\theta^2 - \frac{(r\,dr)^2}{1 + r^2} = \frac{dr^2}{1 + r^2} + r^2 d\theta^2 \tag{14}$$


Compare and contrast with (7) for $S^2$. That one sign makes all the difference, of course.

Now we are invited to write $r = \sinh\psi$, so that $ds^2 = d\psi^2 + \sinh^2\psi\, d\theta^2$. Hyperbolic sine, hyperbolic space,
got it? Curves of constant “latitude” (that is, $\psi$) also form circles, while $W^2 = 1 + r^2$ traces out a hyperbola in the
$(W$-$r)$ plane.

Evidently, we can move up in dimension. The 3-dimensional hyperbolic space $H^3$ is described by the line
element (not very imaginative notation) $dH_3^2 = d\psi^2 + \sinh^2\psi\,(d\theta^2 + \sin^2\theta\, d\varphi^2) = d\psi^2 + \sinh^2\psi\, d\Omega_2^2$. More
generally,


$$dH_d^2 = d\psi^2 + \sinh^2\psi\, d\Omega_{d-1}^2 \tag{15}$$


Note that $H^d$ is constructed from $S^{d-1}$, not $H^{d-1}$.
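
The iterative structure of (12) and (15) is easy to express in code, since in these coordinates the metric is diagonal. The sketch below is only an illustration: the function names and the convention of listing the angles from the outermost one inward are my assumptions, not notation from the text.

```python
import math

def sphere_metric_diag(angles):
    """Diagonal metric components of dOmega_d^2 built from (12).

    angles = (psi_d, ..., psi_1), outermost angle first; the last angle
    (the azimuth) never appears inside a sine, so its value is irrelevant.
    """
    diag, factor = [], 1.0
    for psi in angles:
        diag.append(factor)
        factor *= math.sin(psi) ** 2
    return diag

def hyperbolic_metric_diag(psi, sphere_angles):
    """Diagonal metric components of dH_d^2 from (15): dpsi^2 + sinh^2(psi) dOmega_{d-1}^2."""
    return [1.0] + [math.sinh(psi) ** 2 * g for g in sphere_metric_diag(sphere_angles)]

theta, phi, psi = 1.1, 0.4, 0.7
s2 = sphere_metric_diag((theta, phi))            # [1, sin^2 theta]
s3 = sphere_metric_diag((psi, theta, phi))       # [1, sin^2 psi, sin^2 psi sin^2 theta]
h3 = hyperbolic_metric_diag(psi, (theta, phi))   # [1, sinh^2 psi, sinh^2 psi sin^2 theta]
```

Note that `hyperbolic_metric_diag` calls the sphere routine rather than itself, mirroring the remark that $H^d$ is built from $S^{d-1}$.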

Again, one issue addressed by cosmology is whether we live in $H^3$ or $S^3$. Just as for spheres, we could restore
the length dimension to $r$ and write the metric for $H^3$ as $ds^2 = \frac{dr^2}{1 + r^2/L^2} + r^2 d\Omega_2^2$. (Note that (14) looks like but is not
to be confused with the stereographic metric of a sphere given in exercise I.5.15.) We will encounter hyperbolic
spaces when we study de Sitter and anti de Sitter spacetimes in part IX.

Note that while the embedding space is pseudo-Euclidean, the hyperbolic space is clearly locally Euclidean.
Indeed, around an arbitrary point on $H^3$, say $r = L$, $\theta = \pi/2$, and $\varphi = 0$, the mites living in the space would
experience the perfectly Euclidean metric $ds^2 \simeq \tfrac12 dr^2 + L^2(d\theta^2 + d\varphi^2)$. Indeed, if you want, you can define
$x = r/\sqrt{2}$, $y = L\theta$, $z = L\varphi$, so that $ds^2 \simeq dx^2 + dy^2 + dz^2$. The mites don't know that the embedding space
is not Euclidean and couldn't care less.


Appendix 4: A potential confusion over hyperbolic spaces 


Consider another hyperbolic surface defined by $X^2 + Y^2 - W^2 = 1$ embedded in a space with the metric $ds^2 =
dX^2 + dY^2 - dW^2$. You should draw the surface defined by $X^2 + Y^2 - W^2 = 1$ and compare it with the corresponding
surface in appendix 3. Following the same steps as above, you find $ds^2 = dr^2 + r^2 d\theta^2 - \frac{(r\,dr)^2}{r^2 - 1} =
\frac{dr^2}{1 - r^2} + r^2 d\theta^2$. This is just the sphere in (7). Surprise! (Or were you surprised?)

Well, I bet that you drew something like a cylinder but with a radius that grew toward infinity at both ends.
If you didn't, hurray for you. In all likelihood, you drew the surface as if it were embedded in $E^3$, but it isn't.
Indeed, you see that if you analytically continue $W \to iW$, the surface defines $S^2$.


Exercises 


1 Find the transformation relating the coordinates used by the Eskimo mites in exercise I.5.2 to the coordinates
in (2).


2 Calculate the curvature of the torus by the tangent plane method. 


3 A civilization living in 2-dimensional space marks their world with the coordinates $(\kappa, \zeta)$ handed down eons
ago by their ancestors. Careful measurements of distances between various points over time have shown
that the metric of their world is given by $g_{\kappa\kappa} = 1 + \cdots$, $g_{\zeta\zeta} = 1 + 2\kappa + \cdots$, $g_{\kappa\zeta} = 0 + \cdots$, where the dots
indicate terms quadratic in $\kappa$ and $\zeta$. The civilization is in fact planning to deploy a team of geometers to
measure these quadratic terms. As they develop physics, they find the linear term in $g_{\zeta\zeta}$ terribly irksome.
(a) One day, a bright young physics student points out that changing coordinates from $(\kappa, \zeta)$ to $(\omega, \xi)$,
defined by $\kappa = \omega + \tfrac12 \xi^2$, $\zeta = \xi - \omega\xi$, would cause the linear term in the metric to disappear. Show that
this is in fact the case.
(b) Many years later, another bright young physics student realizes that the “crazy” coordinates $(\kappa, \zeta)$ are
just remnants of polar coordinates left by advanced interstellar visitors, who had long since departed.
The student writes down polar coordinates $(r, \theta)$ with $ds^2 = dr^2 + r^2 d\theta^2$ and shows that the civilization
has been flourishing in a small neighborhood of the point $P$ with coordinates $(r, \theta)_P = (r_*, 0)$. The
mysterious coordinates $(\kappa, \zeta)$ turn out to be merely the deviation of $r$, $\theta$ from $P$, suitably scaled by $r_*$.
Explain how this works. Of course, this is just a theory: the civilization would now have to measure the
quadratic terms in their metric to be sure, but preliminary measurements indicate that this theory will
very likely work. Measurements show that the origin of the polar coordinate system is incredibly far
away; nevertheless, an expedition is planned to visit this mysterious place.


4 Find the locally flat coordinates on the Poincaré half plane.


5 Show that for $D = 2$, the combination $2B_{12,12} - B_{11,22} - B_{22,11}$ measures intrinsic curvature. In the simple
example discussed in connection with the tangent plane, since the combination $dx^2 + dy^2$ is invariant
under rotation, it is equal to $du^2 + dv^2$, and thus $ds^2 = dx^2 + dy^2 + dz^2 = du^2 + dv^2 + (\mu u\,du + \nu v\,dv)^2 =
(1 + \mu^2 u^2)du^2 + (1 + \nu^2 v^2)dv^2 + 2\mu\nu\, uv\, du\,dv$. Work out $B_{\mu\nu,\lambda\sigma}$ and the combination specified here.


6 Calculate the combination $2B_{12,12} - B_{11,22} - B_{22,11}$ for the metric found in exercise 4.


7 Show that for $D = 1$, we can set $g_{11}$ to 1 by a coordinate transformation and so curves have no intrinsic
curvature.


8 It is easy to introduce a coordinate singularity by a poor choice of coordinates. Start with $ds^2 = dx^2 + dy^2$
and let $z = y^2$. Find the metric in terms of $(x, z)$.


9 Note that the coordinates $(x, y, z)$ introduced for $H^3$ in appendix 3 are not locally flat. Find the transformation
to locally flat coordinates.


10 Two spaces described by the metric $g_{\mu\nu}$ and $\tilde g_{\mu\nu}$ are said to be conformally related if


$$\tilde g_{\mu\nu}(x) = \Omega^2(x)\, g_{\mu\nu}(x) \tag{16}$$


Show that, given two infinitesimal line segments originating from a point, the angle between them is
preserved by this conformal transformation. In particular, that is why the Mercator map of exercise I.5.3
was popular with navigators. (Bad terminology alert: The term “conformal transformation” often suggests
to students that the two metrics $g_{\mu\nu}$ and $\tilde g_{\mu\nu}$ are related by a coordinate transformation. In general, they are
not. Thus, it is probably better to call “conformal transformation” a Weyl transformation instead.)


11 In the preceding exercise, if the metric $g_{\mu\nu}$ is flat, then the metric $\tilde g_{\mu\nu}$ is said to be conformally flat. In
other words, a metric $g_{\mu\nu}$ (dropping the tilde) is said to be conformally flat if there exists an $\Omega$ such
that $g_{\mu\nu}(x) = \Omega^2(x)\delta_{\mu\nu}$. (In fact, we have already encountered conformally flat spaces in exercise 14 in
the preceding chapter.) In higher dimensions, a metric has to be very special (in particular, it must be
characterized by a single function) to be conformally flat. But 2-dimensional surfaces are so “simple” that
they are all (locally) conformally flat. Show this by a counting argument.


Show that the sphere and the Poincaré half plane are conformally flat. (Again, a bad terminology alert: The 
term “conformal flat space” misleads many students into thinking that the space is flat. In general, it is not. 
For example, consider the Mercator map of exercise I.5.3: the sphere $S^2$ is manifestly not flat.)


Find the curvature of the space described by $ds^2 = y\,dx^2 + x\,dy^2$.


Show that hyperbolic spaces are conformally flat. 
Notes


1. Later in life, he also appears in QFT Nut, older but not wiser. 

2. For pictures, Toy/Universe, p. 25. 

3. Can any space be embedded in $E^N$? If so, what is the minimum value of N for a given space? These nontrivial
questions were answered by John Nash, the mathematician portrayed in the film A Beautiful Mind: Ann. Math.
63 (1956), p. 20.

4. I purposely use the letter r to emphasize that we can use the same letter to describe different things in 
different situations. I trust you not to confuse this r with the r in the usual spherical coordinates and which 
we restricted to be 1 in the preceding chapter to obtain the metric for the unit sphere. The r here is the “polar” 
radial variable in polar coordinates. 

5. For the purpose of this book, we call a Riemannian manifold a space whose metric is smooth enough to be 
differentiated an appropriate number of times. This may require finding an appropriate set of coordinates. 

6. The rigor-minded reader realizes that we actually need to check this. 
“Classical” differential geometry


I feel that it would be good for those readers seeing Riemannian geometry for the first time 
to work through some “classical” differential geometry¹ dealing with curves and surfaces,
“real” stuff that you could actually see and “hold in your hands.” Throughout this chapter, 
we will be living in good old 3-dimensional Euclidean space. I am going to tell you how the 
greats like Frenet and Gauss thought about curves and surfaces. None of the fancy tangent 
bundle talk for us; we will just do it. Action, not talk! 

One advantage of this approach² is that you will gain a geometric feel for important
concepts such as covariant differentiation, curvature, and the Christoffel symbol, which
in some texts are introduced immediately in a more high-powered and abstract fashion. We
will, of course, also get to the more general and direct Riemannian approach³ to curvature
in due time. Hence, it is entirely possible for those who do not care as much as I do about 
“classical” mathematics to skip this chapter. 


Curves 


Consider a curve C given by $\vec X(l)$ parametrized by the length $l$ along the curve. In other
words, we start from an arbitrary point O on the curve, and pace off a distance equal to $l$
along the curve. Our location is then specified by the vector $\vec X(l)$.

The unit tangent vector is $\vec t = \dot{\vec X} \equiv \frac{d\vec X}{dl}$. Following Newton, we denote differentiation
with respect to $l$ by a dot, as was already done in chapter I.1. (All vectors in this chapter
have 3 components and will be labeled with an arrow.) Since we said that $l$ is the length,
namely $dl^2 = d\vec X \cdot d\vec X$, we have $\vec t \cdot \vec t = 1$. Differentiating, we obtain $\vec t \cdot \dot{\vec t} = 0$. Thus, the unit
vector $\vec p$ defined by


$\dot{\vec t} = \kappa\,\vec p$ (1)
Figure 1 The moving trihedron and osculating plane of a smooth curve C.


(known as the principal normal vector) is orthogonal to $\vec t$. You can see that $\kappa$, equal to $|\dot{\vec t}|$
by definition, has to do with the curvature of the curve C at the point $\vec X$.

Imagine driving a race car along the curve: $\vec t$ indicates the direction you are pointing in
and $\dot{\vec t}$ how fast you have to turn the steering wheel. Anyway, we physicists recognize $\dot{\vec t}$ as the
acceleration $\ddot{\vec X}$ and $\kappa$ as a measure of the centrifugal force. In this analogy, $l$ is time, and
so speed is fixed to be 1. This is kind of a neat example of how physics and mathematics
intertwine: the centrifugal force tells us about curvature.

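Next define the unit vector $\vec b(l) = \vec t(l) \times \vec p(l)$, known as the binormal vector. As we move
along the curve, we have what 18th century mathematicians called a moving trihedron
formed by the triplet $\vec t$, $\vec p$, and $\vec b$. The vectors $\vec t$ and $\vec p$ form a plane, named the osculating
plane (from the Latin for “kissing”) by D’Amondans Charles de Tinseau (1748-1822) in
1780. See figure 1.

By construction, $\vec t \cdot \vec b = \vec t \cdot (\vec t \times \vec p) = 0$. Differentiating this equation, we get $\dot{\vec t} \cdot \vec b + \vec t \cdot \dot{\vec b} = 0$.
Using $\vec p \cdot \vec b = 0$ (so that $\dot{\vec t} \cdot \vec b = \kappa\,\vec p \cdot \vec b = 0$ by (1)), we find that $\dot{\vec b}$ is orthogonal to $\vec t$. Also, differentiating $\vec b \cdot \vec b = 1$, we obtain
$\vec b \cdot \dot{\vec b} = 0$. Since $\dot{\vec b}$ is orthogonal to both $\vec t$ and $\vec b$, we conclude that


$\dot{\vec b} = -\tau\,\vec p$ (2)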


Evidently, $\tau$, known as the torsion of the curve C at the point $\vec X$, measures how the curve
is twisting. If the discussion is unclear to you at any point, draw your own picture!

The next question is how the unit normal $\vec p$ changes as we move along. Noting that
$\vec p = \vec b \times \vec t$, we differentiate and use (1) and (2) to obtain


$\dot{\vec p} = \dot{\vec b} \times \vec t + \vec b \times \dot{\vec t} = \tau\,\vec b - \kappa\,\vec t$ (3)


We can package the three equations, (1), (2), and (3), known as the Frenet-Serret
equations (in memory of Jean Frenet (1816-1900) and Joseph Serret (1819-1885)), more
compactly by introducing the 9-component object $\psi = \begin{pmatrix} \vec t \\ \vec p \\ \vec b \end{pmatrix}$. Then


$\dot\psi = A\psi$, with $A = \begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{pmatrix}$ (4)


The antisymmetry of A ensures that $\psi \cdot \dot\psi = 0$.
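
The Frenet-Serret equations are easy to check numerically. The following is a minimal sketch, not part of the original text: it takes a circular helix as the test curve (the radius R and pitch c are arbitrary illustrative choices), builds $\vec t$, $\vec p$, and $\vec b$ by finite differences, reads $\kappa$ off (1) and $\tau$ off (2), and then measures how well (3) is satisfied. All function names are made up for this illustration.

```python
import math

def curve(l, R=2.0, c=1.0):
    """A circular helix X(l), parametrized by arc length l (R and c are illustrative)."""
    s = math.hypot(R, c)  # arc length per unit of helix angle
    return (R * math.cos(l / s), R * math.sin(l / s), c * l / s)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def deriv(f, l, h=1e-3):
    """Central-difference derivative of a vector-valued function of l."""
    return tuple((a - b) / (2 * h) for a, b in zip(f(l + h), f(l - h)))

def t(l):
    """Unit tangent, t = dX/dl."""
    return deriv(curve, l)

def kappa(l):
    """Curvature, kappa = |dt/dl|, from equation (1)."""
    return norm(deriv(t, l))

def p(l):
    """Principal normal, the unit vector along dt/dl, from equation (1)."""
    td = deriv(t, l)
    k = norm(td)
    return tuple(a / k for a in td)

def b(l):
    """Binormal, b = t x p."""
    return cross(t(l), p(l))

def tau(l):
    """Torsion, read off from equation (2): db/dl = -tau p."""
    return -dot(deriv(b, l), p(l))

def frenet_residual(l):
    """|dp/dl - (tau b - kappa t)|, which equation (3) says should vanish."""
    lhs = deriv(p, l)
    k, tw = kappa(l), tau(l)
    rhs = tuple(tw * bi - k * ti for bi, ti in zip(b(l), t(l)))
    return norm(tuple(x - y for x, y in zip(lhs, rhs)))

if __name__ == "__main__":
    l0 = 0.7
    print("kappa =", kappa(l0), " tau =", tau(l0))
    print("residual of (3) =", frenet_residual(l0))
```

The residual is limited only by the finite-difference step; nothing here depends on the helix beyond its being parametrized by arc length, so any other such curve could be substituted for `curve`.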
Surfaces


We now graduate from curves to surfaces in the 3-dimensional Euclidean space $E^3$ we
were born into. A surface embedded in $E^3$ is defined by $\vec X(x^1, x^2)$. In contrast to a curve
$\vec X(l)$ parametrized by $l$, the surface is parametrized by two coordinates $x^\mu$, with the index $\mu$
taking on 2 values: this feature is of course what makes a surface a 2-dimensional object.
Also, while the length along the curve $l$ provides a natural parametrization, there is no
comparable natural parametrization for a surface. If the discussion becomes too abstract
for you at any point, you can always think of the familiar sphere (with $x^1 = \theta$, $x^2 = \varphi$) for
which


$\vec X = \begin{pmatrix} \sin\theta\cos\varphi \\ \sin\theta\sin\varphi \\ \cos\theta \end{pmatrix}$


The two 3-component vectors $\vec e_\mu \equiv \partial_\mu \vec X \equiv \frac{\partial \vec X}{\partial x^\mu}$, labeled by the index $\mu$, form the basis
vectors for the surface. For example, on the sphere,


$\vec e_1 = \begin{pmatrix} \cos\theta\cos\varphi \\ \cos\theta\sin\varphi \\ -\sin\theta \end{pmatrix}$ and $\vec e_2 = \begin{pmatrix} -\sin\theta\sin\varphi \\ \sin\theta\cos\varphi \\ 0 \end{pmatrix}$


To make absolutely sure that there is no confusion, let me say again that $\vec X(x)$ lives in
the ambient 3-dimensional Euclidean space $E^3$, and $\vec e_\mu(x)$ are two 3-vectors labeled by
$\mu = 1, 2$. Note that $x^1$, $x^2$ are coordinates on the surface, not components of $\vec X$.


Tangent plane and metric 


Linear combinations of the two basis vectors span the tangent plane at the point labeled
by the coordinates $x$; in other words, the set of all points represented by $u^\mu \vec e_\mu(x) =
u^1 \vec e_1(x) + u^2 \vec e_2(x)$, for $u^1$, $u^2$ any two real numbers, form the tangent plane. The tangent
plane changes as we move around on the surface, of course. For example, on the sphere, at
the point $(\theta = \pi/2, \varphi = 0)$, the tangent plane consists of $\begin{pmatrix} 0 \\ u^2 \\ -u^1 \end{pmatrix}$ as $u^1$ and $u^2$ range over the
real numbers. For another example, again on the sphere, at the point $(\theta = \pi/2, \varphi = \pi/2)$,
the tangent plane consists of $\begin{pmatrix} -u^2 \\ 0 \\ -u^1 \end{pmatrix}$, again as $u^1$ and $u^2$ range over the real numbers.

Another familiar example is the cylinder with radius $a$, for which (with the choice
$x^1 = \varphi$, $x^2 = z$) $\vec X = \begin{pmatrix} a\cos\varphi \\ a\sin\varphi \\ z \end{pmatrix}$. It follows that $\vec e_1 = \begin{pmatrix} -a\sin\varphi \\ a\cos\varphi \\ 0 \end{pmatrix}$ and $\vec e_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$. Note that
if we think of $a$ as a length, $\vec e_1$ and $\vec e_2$ do not have the same dimension.


Since $d\vec X = \partial_\mu \vec X\, dx^\mu$, the distance squared between two neighboring points with coordinates
$x$ and $x + dx$ is given by $ds^2 = d\vec X \cdot d\vec X = (\partial_\mu \vec X \cdot \partial_\nu \vec X)\,dx^\mu dx^\nu = \vec e_\mu \cdot \vec e_\nu\, dx^\mu dx^\nu$.
Figure 2 The tangent plane and the normal to a surface at some point P.


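In other words, the metric on the surface is


$g_{\mu\nu} = \vec e_\mu \cdot \vec e_\nu$ (5)


We use the Einstein repeated summation convention: all repeated indices are summed
over. For example, $d\vec X = \partial_\mu \vec X\, dx^\mu = \partial_1 \vec X\, dx^1 + \partial_2 \vec X\, dx^2$. You should check that (5) gives the
familiar result $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ for the sphere. For the cylinder, $ds^2 = \vec e_1 \cdot \vec e_1\, d\varphi^2 +
\vec e_2 \cdot \vec e_2\, dz^2 = a^2 d\varphi^2 + dz^2$.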
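
If you prefer to let a machine do the checking, here is a minimal sketch (not from the original text) of equation (5) in action: it differentiates the embedding $\vec X(x^1, x^2)$ numerically to get the basis vectors and then takes dot products. The function names, the finite-difference step, and the cylinder radius a = 2 are assumptions made purely for illustration.

```python
import math

def sphere(theta, phi):
    """Unit sphere with x^1 = theta, x^2 = phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def cylinder(phi, z, a=2.0):
    """Cylinder of radius a (illustrative value) with x^1 = phi, x^2 = z."""
    return (a * math.cos(phi), a * math.sin(phi), z)

def basis_vectors(X, x1, x2, h=1e-5):
    """e_mu = dX/dx^mu, approximated by central differences."""
    e1 = tuple((p - m) / (2 * h) for p, m in zip(X(x1 + h, x2), X(x1 - h, x2)))
    e2 = tuple((p - m) / (2 * h) for p, m in zip(X(x1, x2 + h), X(x1, x2 - h)))
    return (e1, e2)

def metric(X, x1, x2):
    """g_{mu nu} = e_mu . e_nu, equation (5), returned as a 2-by-2 list."""
    e = basis_vectors(X, x1, x2)
    return [[sum(a * b for a, b in zip(e[m], e[n])) for n in range(2)]
            for m in range(2)]

if __name__ == "__main__":
    theta, phi = 0.8, 0.3
    print("sphere:  ", metric(sphere, theta, phi), " sin^2(theta) =", math.sin(theta) ** 2)
    print("cylinder:", metric(cylinder, phi, 1.1))
```

The sphere output can be compared against $d\theta^2 + \sin^2\theta\, d\varphi^2$ and the cylinder output against $a^2 d\varphi^2 + dz^2$.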


As we move about on the surface, how does the tangent plane rock and roll? 


We now ask how the two basis vectors change as we move about on the surface. Consider
$\vec e_{\mu,\nu} \equiv \partial_\nu \vec e_\mu = \partial_\nu \partial_\mu \vec X$. Here for typographical convenience, we have introduced the standard
notation $\mathcal{E}_{,\mu} \equiv \partial_\mu \mathcal{E}$ for any expression $\mathcal{E}$. Thus, $\vec e_{\mu,\nu}$ denotes a 3-vector labeled by two
indices $\mu$ and $\nu$. In general, this vector will have a component pointing out of the surface.
Next, denote the unit normal to the surface by


$\vec n = \frac{\vec e_1 \times \vec e_2}{|\vec e_1 \times \vec e_2|}$ (6)


(not to be confused with $\vec p$ in our discussion of curves, of course). See figure 2. For example,
for the cylinder, $\vec n = \begin{pmatrix} \cos\varphi \\ \sin\varphi \\ 0 \end{pmatrix}$.


As I just said, the vector $\vec e_{\mu,\nu}$, namely the change of $\vec e_\mu$ in the direction $\nu$, sticks out of
the surface and thus has a component along $\vec n$. So let us expand $\vec e_{\mu,\nu}$ in terms of the basis
vectors $\vec e_\lambda$ and $\vec n$:


$\vec e_{\mu,\nu} = \Gamma^\lambda_{\mu\nu}\, \vec e_\lambda + K_{\mu\nu}\, \vec n$ (7)


Since $\partial_\mu \partial_\nu = \partial_\nu \partial_\mu$, the vectors $\vec e_{\mu,\nu}$ and the expansion coefficients $\Gamma^\lambda_{\mu\nu}$ and $K_{\mu\nu}$ are symmetric
in their two lower indices. (For future use, we note that $\Gamma^\lambda_{\mu\nu}$ is known as the Christoffel
symbol.) Contracting (7) by the normal vector, we obtain


$K_{\mu\nu} = \vec e_{\mu,\nu} \cdot \vec n$ (8)


known as Gauss’s equation.
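We hasten to give an example. For the sphere, with $\vec X$, $\vec e_1$, and $\vec e_2$ as given before,
we have $\vec n = \vec X$, $\vec e_{1,1} = \partial_\theta \vec e_1 = -\vec n$, $\vec e_{1,2} = \partial_\varphi \vec e_1 = \vec e_{2,1} = \partial_\theta \vec e_2 = \cot\theta\, \vec e_2$, and $\vec e_{2,2} = \partial_\varphi \vec e_2 =
-\sin\theta\cos\theta\, \vec e_1 - \sin^2\theta\, \vec n$. From (7) we read off


$K_{11} = -1, \quad K_{22} = -\sin^2\theta$ (9)


and


$\Gamma^2_{12} = \Gamma^2_{21} = \cot\theta, \quad \Gamma^1_{22} = -\sin\theta\cos\theta$ (10)


with all other entries of $K$ and $\Gamma$ vanishing.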

By drawing a picture, you can see that, as we move from one point to a neighboring point,
the change of $\vec e_1$ and $\vec e_2$ projected into the tangent plane (the first term in (7)) tells us how
the tangent plane is rotating around the normal vector $\vec n$, while the change projected in the
direction of the normal (the second term in (7)) tells how the tangent plane is “rocking and
rolling.” Thus, the coefficients $K_{\mu\nu}$ tell us about how the surface is curving in the ambient
3-dimensional Euclidean space.
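
As a cross-check on (9) and (10), here is a minimal numerical sketch, not part of the original text. It takes the sphere's basis vectors as given above, differentiates them by finite differences, and splits each $\vec e_{\mu,\nu}$ into its normal part ($K_{\mu\nu}$, equation (8)) and its tangential part ($\Gamma^\lambda_{\mu\nu}$, obtained by dotting (7) with $\vec e_\rho$ and applying the inverse metric). The helper names are invented for this illustration, and indices in the code run over 0, 1 rather than 1, 2.

```python
import math

def basis(theta, phi):
    """e_1 and e_2 on the unit sphere, as given in the text."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [(ct * cp, ct * sp, -st), (-st * sp, st * cp, 0.0)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit_normal(theta, phi):
    """Equation (6): n = e_1 x e_2 / |e_1 x e_2|."""
    e1, e2 = basis(theta, phi)
    c = cross(e1, e2)
    length = math.sqrt(dot(c, c))
    return tuple(x / length for x in c)

def basis_derivatives(theta, phi, h=1e-5):
    """de[mu][nu] approximates e_{mu,nu} = partial_nu e_mu (central differences)."""
    plus = [basis(theta + h, phi), basis(theta, phi + h)]
    minus = [basis(theta - h, phi), basis(theta, phi - h)]
    return [[tuple((a - b) / (2 * h) for a, b in zip(plus[nu][mu], minus[nu][mu]))
             for nu in range(2)] for mu in range(2)]

def K_and_Gamma(theta, phi):
    """Decompose e_{mu,nu} as in (7); returns K[mu][nu] and Gamma[lam][mu][nu]."""
    e = basis(theta, phi)
    normal = unit_normal(theta, phi)
    g = [[dot(e[m], e[k]) for k in range(2)] for m in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]   # vanishes at the poles
    g_inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    de = basis_derivatives(theta, phi)
    K = [[dot(de[m][k], normal) for k in range(2)] for m in range(2)]
    Gamma = [[[sum(g_inv[lam][r] * dot(e[r], de[m][k]) for r in range(2))
               for k in range(2)] for m in range(2)] for lam in range(2)]
    return K, Gamma

if __name__ == "__main__":
    K, Gamma = K_and_Gamma(0.9, 0.4)
    print("K =", K)
    print("Gamma^2_12 =", Gamma[1][0][1], " Gamma^1_22 =", Gamma[0][1][1])
```

Note that the tangential part uses only $\vec e_\rho$ and the metric, never $\vec n$, which is a small numerical hint of what exercise 6 asks you to show.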


Covariant derivative 


Let $\vec W(x)$ be a vector field. In other words, at every point* $x$, two numbers, $W^1(x)$
and $W^2(x)$, are given, so that someone living in the ambient $E^3$ sees a vector field
$\vec W(x) = W^\mu(x)\, \vec e_\mu(x)$. Since $\vec W$ is a linear combination of the two basis vectors, it lives
on the tangent plane. In other words, it does not stick out of the surface.

I now introduce one of the most basic concepts of differential geometry, that of covariant
derivative. Unaccountably, some texts make covariant differentiation sound mysterious
and complicated, when in fact it is intuitive and simple. Suppose we want to differentiate
the vector field $W^\mu(x)$. It is not enough to ask how the components $W^1(x)$ and $W^2(x)$
change when we move from the point $x$ to a neighboring point $x + dx$. The basis vectors
$\vec e_\mu(x)$, against which the components are measured, are themselves changing. The
covariant derivative simply takes into account this obvious geometric fact, namely the variation
of the basis vectors. This effect does not occur in Euclidean space: once we set up
the usual unit basis vectors pointing in the x, y, z directions, they do not change.

Mathematically, all we have to do is to differentiate using the product rule and follow 
our noses: 


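$\partial_\nu \vec W(x) = \partial_\nu \left( W^\mu(x)\, \vec e_\mu(x) \right) = (\partial_\nu W^\mu(x))\, \vec e_\mu(x) + W^\mu(x)\, \partial_\nu \vec e_\mu(x)$
$= (\partial_\nu W^\lambda)\, \vec e_\lambda + W^\mu \Gamma^\lambda_{\mu\nu}\, \vec e_\lambda + W^\mu K_{\mu\nu}\, \vec n$ (11)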


In the first line, the second term expresses the effect we were just talking about: the basis 


* The phrase “point x” is of course shorthand for “point described by x in our chosen coordinate system.” We 
are a bit less precise, but remember, “brevity is the soul of wit.” 
vectors themselves vary as we move about. In the second line, we used (7) and renamed
some dummy indices we sum over. The key point is that $\partial_\nu \vec W$ contains a component along
$\vec n$, the normal to the surface.

Imagine ourselves members of the mite civilization in the prologue. We do not know
about vectors sticking out of our universe: all we know and care about are vectors lying
inside our universe. Thus, we invite ourselves to define a covariant derivative⁴ by dropping
the term proportional to $\vec n$ in (11):


$D_\nu \vec W = (\partial_\nu W^\lambda + W^\mu \Gamma^\lambda_{\mu\nu})\, \vec e_\lambda \equiv (D_\nu W^\lambda)\, \vec e_\lambda$ (12)


In the last step we defined $D_\nu W^\lambda \equiv \partial_\nu W^\lambda + \Gamma^\lambda_{\nu\mu} W^\mu$. (Recall that $\Gamma$ is symmetrical in its
two lower indices.)

From my experience teaching, I know that some beginning students get confused here. 
But really, the concept of covariant derivative is at heart quite simple. Think of yourself as 
a mite, and you don’t know about vectors sticking out of the surface that forms your world. 
So you just drop that component in the derivative, and you get the covariant derivative. 

You the mighty mite do not know about $\partial_\mu \vec W$, only about $D_\mu \vec W$. The concept of covariant
derivative is central to differential geometry and hence to Einstein gravity. 
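
The claim that "differentiate in the ambient space, then drop the normal part" agrees with the component formula can be tested directly. The sketch below is not from the original text: it picks an arbitrary vector field on the unit sphere ($W^1 = \sin\varphi$, $W^2 = \cos\theta$, chosen only for illustration), computes the tangential components of $\partial_\nu \vec W$ by projection, and compares them with $\partial_\nu W^\lambda + \Gamma^\lambda_{\nu\mu} W^\mu$ using the Christoffel symbols of the unit sphere. Function names are hypothetical, and code indices run over 0, 1.

```python
import math

def basis(theta, phi):
    """e_1 and e_2 on the unit sphere, as given in the text."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [(ct * cp, ct * sp, -st), (-st * sp, st * cp, 0.0)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def W_components(theta, phi):
    """An arbitrary illustrative field: W^1 = sin(phi), W^2 = cos(theta)."""
    return (math.sin(phi), math.cos(theta))

def W_ambient(theta, phi):
    """The 3-vector W = W^mu e_mu as seen from the ambient space."""
    w = W_components(theta, phi)
    e = basis(theta, phi)
    return tuple(w[0] * a + w[1] * b for a, b in zip(e[0], e[1]))

def partial(f, theta, phi, nu, h=1e-5):
    """Central difference of a tuple-valued f along coordinate nu (0: theta, 1: phi)."""
    dt, dp = (h, 0.0) if nu == 0 else (0.0, h)
    plus, minus = f(theta + dt, phi + dp), f(theta - dt, phi - dp)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

def covariant_by_projection(theta, phi, nu):
    """Tangential components of partial_nu W: the part along n is simply never used."""
    e = basis(theta, phi)
    dW = partial(W_ambient, theta, phi, nu)
    # On the unit sphere g = diag(1, sin^2 theta), so raising the index is a division.
    g_diag = (1.0, math.sin(theta) ** 2)
    return tuple(dot(e[lam], dW) / g_diag[lam] for lam in range(2))

def covariant_by_formula(theta, phi, nu):
    """D_nu W^lam = partial_nu W^lam + Gamma^lam_{nu mu} W^mu for the unit sphere."""
    cot = math.cos(theta) / math.sin(theta)
    # Gamma[lam][nu][mu]; only Gamma^1_22 and Gamma^2_12 = Gamma^2_21 are nonzero.
    Gamma = [[[0.0, 0.0], [0.0, -math.sin(theta) * math.cos(theta)]],
             [[0.0, cot], [cot, 0.0]]]
    w = W_components(theta, phi)
    dw = partial(W_components, theta, phi, nu)
    return tuple(dw[lam] + sum(Gamma[lam][nu][mu] * w[mu] for mu in range(2))
                 for lam in range(2))

if __name__ == "__main__":
    for nu in range(2):
        print(nu, covariant_by_projection(0.9, 0.4, nu), covariant_by_formula(0.9, 0.4, nu))
```

The two columns printed for each direction should agree up to finite-difference error, which is the mite's and the ambient observer's bookkeeping giving the same answer.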


Parallel transport 


Let me explain the covariant derivative in a slightly different way. If we are living in
Euclidean space and we want to differentiate a vector field, we simply follow Newton and
calculate the limit of $\vec W(x + \delta x) - \vec W(x)$. But this implies that we know how to subtract a
vector defined at the point $x$ from another vector defined at a different point $y = x + \delta x$.
Recall how we learned as children to subtract one vector from another. We were taught to
slide one vector over to the other, so that their feathered ends coincide; the gap between
their sharp pointy ends is then the difference we want. See figure 3. But of course when we
slide a vector over, we have to take care that we do not rotate it; we need to keep it pointing
in the same direction.

Sliding a vector over taking care to keep it pointing in the same direction is known as 
“parallel transporting” the vector. Of course, in Euclidean space, parallel transport is trivial, 


Figure 3 How we learned as children to subtract one vector from another.
Figure 4 When a vector living in the tangent plane at one point on a curved
surface is parallel transported to a nearby point, it will in general not live in the 
tangent plane at that point. The component sticking out of the tangent plane 
(dashed line) has to be projected away. 


and we do it without giving it a second thought. But if we are living on a curved surface, it 
is not so simple to parallel transport. Given a vector V at the point x, how do we parallel 
transport it to some other point y? 

To the more knowledgeable beings living in the ambient $E^3$, it is obvious. Just parallel
transport $\vec V$ in the ambient Euclidean space. Any child could do it, they yell.

The trouble is that while the 3-vector $\vec V$ lives in the tangent plane at $x$, it doesn’t
necessarily live in the tangent plane at $y$: it will in general have a component sticking
out of that tangent plane. Thus, we have to project $\vec V$ onto the tangent plane at $y$. See
figure 4.

Well, we know how. Write $\vec V$ as a linear combination of $\vec e_1(y)$, $\vec e_2(y)$, and $\vec n(y)$, and
then simply drop the piece proportional to $\vec n(y)$. The result is the vector $\vec V$ projected onto
the tangent plane at $y$, which we denote by $(\vec V \to y)_P$. You can work out $(\vec V \to y)_P$ in an
exercise, but we don’t really need the explicit form here. Note that $(\vec V \to y)_P$ is a vectorial
function of the vector $\vec V$ and of the location $y$. The subscript P reminds us that we are
projecting. I have to ask you to understand the notation before reading on. Again, I have
encountered an occasional student who finds the notation confusing. In fact, the notation,
which may appear cumbersome, is needed to make the discussion clear.

This discussion tells us not to blindly follow Newton and calculate the limit of $\vec W(x + \delta x) - \vec W(x)$.
No, Sir Isaac, we don’t want to compare $\vec W(x + \delta x)$ against $\vec W(x)$. No sir, we
want to compare $\vec W(x + \delta x)$ against $(\vec W(x) \to x + \delta x)_P$.

And, dear reader, what does this clunky but informative notation $(\vec W(x) \to x + \delta x)_P$
mean? It means the result of the following procedure: we take the vector $\vec W(x)$ (corresponding
to the $\vec V$ in the preceding paragraph), parallel transport it to $x + \delta x$, and then
project by throwing away the component that sticks out of the tangent plane at $x + \delta x$.
That is the beast we want to subtract from $\vec W(x + \delta x)$.

So, put the difference $\vec W(x + \delta x) - (\vec W(x) \to x + \delta x)_P$ into Newton’s limiting machine,
that is, take the limit $\delta x \to 0$ of this difference.
But the result of this “geometrical” construction is effectively exactly the same as the previous
construction, namely dropping the component proportional to $\vec n$ from the ordinary
derivative $\partial_\mu \vec W$ to define the covariant derivative $D_\mu \vec W$, precisely as in (12).

Yet another way of saying this is that the covariant derivative $D_\mu \vec W$ does not express how
$\vec W(x + \delta x)$ differs from $\vec W(x)$, but how $\vec W(x + \delta x)$ differs from $\vec W(x)$ parallel transported
to $x + \delta x$, that is, how $\vec W(x + \delta x)$ differs from $(\vec W(x) \to x + \delta x)_P$. (Some students are
perhaps confused because here, by parallel transport on a curved surface, we actually mean
parallel transport in the ambient $E^3$ and then projection onto the tangent plane.)
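
The recipe "slide the vector over in the ambient space, then project" can be iterated in small steps to carry a vector along a finite path. Here is a minimal sketch, not part of the original text, that does this on the unit sphere around a circle of constant $\theta$: at each small step the vector is left unchanged as a 3-vector and then its component along the new normal is dropped. The step count and function names are illustrative assumptions, and the discretization is only first-order accurate.

```python
import math

def normal(theta, phi):
    """Unit normal of the unit sphere, which coincides with the position vector."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(V, n):
    """Drop the component of V along the unit normal n."""
    d = dot(V, n)
    return tuple(v - d * c for v, c in zip(V, n))

def transport_around_latitude(theta0, steps=20000):
    """Carry e_1 once around the circle theta = theta0 in small steps,
    projecting onto the new tangent plane after each step."""
    start = (math.cos(theta0), 0.0, -math.sin(theta0))  # e_1 at phi = 0
    V = start
    for k in range(1, steps + 1):
        phi = 2 * math.pi * k / steps
        V = project(V, normal(theta0, phi))
    return start, V

def angle_between(u, v):
    c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.acos(max(-1.0, min(1.0, c)))

if __name__ == "__main__":
    for theta0 in (math.pi / 2, math.pi / 3):
        start, end = transport_around_latitude(theta0)
        print("theta0 =", theta0, " angle between start and end =", angle_between(start, end))
```

Around the equator, which is a great circle, the vector should come back pointing the way it started; around other circles of latitude it comes back rotated, even though at no step did we do anything but slide and project.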

A rough analogy may help some readers. When you think about how much your 
income has risen, you don’t simply differentiate your income with respect to time. More 
meaningful is your income increase adjusted for inflation. It could be that your income is 
not increasing intrinsically, but the dollar (or whatever currency you get paid in) figure 
is rising because of inflation. Similarly, the covariant derivative $D_\mu \vec W$ is the ordinary
derivative $\partial_\mu \vec W$ adjusted for the change in the reference frame.

Thus, if we have a vector field $\vec V(x)$ satisfying $D_\mu \vec V = 0$, then it is not changing intrinsically.
It has simply been parallel transported all over space, a fact of considerable military
significance.


The ancient art of war 


Imagine that it is 300 BC and that you are the physicist-sorcerer attached to the army
commanded personally by the Emperor the Son of Heaven. The other sorcerer, the astro 
guy whom you have always derided for staring at the sky, predicted a thick fog on the day of 
the battle, so that the soldiers would not be able to see which way was which. The Emperor 
ordered you to solve the problem. Having read this book, you immediately realized that 
your task was to parallel transport a vector so that it always pointed south. You quickly 
had an imperial south-pointing carriage constructed, on top of which was a statue of the 
Emperor pointing south. Indeed, the day was frighteningly foggy, a pea soup fog, as the 
Brits would say. The soldiers could barely see beyond an arm’s length, and the enemy 
became totally confused. In contrast, as the south-pointing carriage moved around this 
way and that, the statue always pointed south, and so the Emperor scored a huge victory. 
The Emperor was so delighted that you (and that astro guy too) received tenure at the court 
and lived happily ever after. 

Would I make something like this up? Obviously not. The south-pointing carriage was 
described in ancient Chinese chronicles. Unfortunately, the detailed construction plan was 
lost to posterity, but in the 20th century, a contest produced a modern reconstruction.* (See 
figure 5.) I won’t bother to show the engineering drawing here but pose the design to you
as a challenge.⁵


* Louis Grace of the physics department at the University of California, Santa Barbara, kindly built this war 
chariot for me, complete with a statue of Emperor Albert on top. 
Figure 5 A modern version of the south-pointing carriage
described in ancient Chinese chronicles. 


Gauss’s strategy 


To determine the curvature of the surface at a given point P, Gauss’s strategy was to 
study curves on the surface passing through P and calculate their curvatures. This sounds 
puzzling at first. How did Gauss expect to determine⁶ the curvature of a surface by studying
the curvature of curves lying on the surface? 

To motivate his reasoning, consider the curvature at the saddle point P with coordinates
$(x = 0, y = 0)$ on the surface $z = \frac{1}{2}px^2 - \frac{1}{2}qy^2$, taking $p > 0$, $q > 0$ for definiteness (a
special case of the surface considered in chapter I.5). Imagine yourself walking along the
ridge defined by y = 0: on both sides of you, the land falls away (see figure 6). When 
you reach the lowest point x = 0 on the ridge, the land in front of you and behind you 
rises up. Consider the family of curves going through P. For a curve pointing in the
x direction, the curvature is positive, while for a curve pointing in the y direction, the
curvature is negative (taking the normal to point up, as in the calculation later in this chapter). Evidently, for a curve pointing in some other direction, the curvature
is intermediate between two extremal values. Gauss proposed to find these two extremal 
values. 

So for a given point P on a curved space, let’s look at some curve going through that
point and study its tangent vector $\vec t = t^\mu \vec e_\mu$ with $t^\mu = \frac{dx^\mu}{dl}$. Then


$\frac{d\vec t}{dl} = \frac{dt^\mu}{dl}\, \vec e_\mu + t^\mu \frac{d\vec e_\mu}{dl}$ (13)
Figure 6 Determining the curvature of a surface with a
saddle point P. 


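Write the right hand side as some linear combination of $\vec e_1$, $\vec e_2$, and $\vec n$, so that


$\frac{d\vec t}{dl} = \kappa_g\, \vec e + \kappa_n\, \vec n$ (14)


with $\vec e$ some unit vector given by some linear combination of $\vec e_1$ and $\vec e_2$. Note in passing
that since $\vec t \cdot \vec t = 1$, we have $\vec t \cdot \frac{d\vec t}{dl} = 0$ and so $\vec t \cdot \vec e = 0$. The three unit vectors $\vec n$, $\vec t$, and
$\vec e$ form an orthonormal triad. Note also that $\vec e_\mu$ and $\vec n$ pertain to the surface, while $\vec t$
and $\vec e$ pertain to the curve.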

To get a feel for the two quantities $\kappa_g$ and $\kappa_n$ in (14), consider a couple of surfaces. Draw
a couple of pictures as you read the next two paragraphs. That will make it easy to follow 
what I am talking about, which is not much more than common spatial sense. 

First, a plane flat as Kansas. Through P draw a curve as curvy as you like, and you can
make $\kappa_g$ as large as you like. But try as you may, $\frac{d\vec t}{dl}$ will stay in the plane by definition, and
$\kappa_n$ will remain stubbornly equal to 0. Evidently, $\kappa_g$ does not tell us about the curvature of
the surface, merely the curvature of the curve.

Next, picture a sphere, and let P be Copenhagen, say. Following Gauss, consider a variety
of curves going through Copenhagen. For example, think of the circle of constant latitude.
Then $\frac{d\vec t}{dl}$ points toward the axis of the earth joining the north and south poles and can be
written as in (14) with $\vec e$ pointing north along some street in Copenhagen and $\vec n$ pointing
upward* at the sky (as always, independent of where we are). As we try different curves,
$\kappa_g$ and $\kappa_n$ vary. If we choose the curve to be the circle of constant longitude (rather than
latitude) going through Copenhagen, $\kappa_g$ vanishes, while $|\kappa_n|$ is maximized, given by the
inverse of the earth’s radius. Indeed, $|\kappa_n|$ attains its maximum value for the two great circles
going through Copenhagen. If you have trouble with this, try drawing a picture.

This example shows clearly that it is $\kappa_n$, not $\kappa_g$, that tells us about the curvature of the
surface. You can further convince yourself by picturing other examples, such as the ridge 
you were walking on earlier. While there is an infinite number of surfaces with different 


* With this convention (rather than pointing downward toward the center of the earth, as would be mathematically
more natural), $\kappa_g > 0$ while $\kappa_n < 0$.
global properties, if we focus on a local patch of the surface around a point P, there are
only a couple of prototypical surfaces, sphere-like or ridge-like, so to speak.


From curvature of curves to curvature of surface 


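Using (13) and (7), we have (suppressing terms not of interest to us)


$\frac{d\vec t}{dl} = \cdots + t^\mu \frac{d\vec e_\mu}{dl} = \cdots + t^\mu \frac{dx^\nu}{dl}\, \vec e_{\mu,\nu} = \cdots + K_{\mu\nu}\, t^\mu t^\nu\, \vec n$ (15)


Comparing with (14), we can extract


$\kappa_n = K_{\mu\nu}\, t^\mu t^\nu$ (16)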


A digressive word about what appears to physicists as rather quaint terminology. Strip
the $dl$ off $t^\mu = \frac{dx^\mu}{dl}$ in $\kappa_n$ and we encounter the combination $K_{\mu\nu}\, dx^\mu dx^\nu$, known to classical
differential geometers as the second fundamental form.⁷ Sounds pretty important.
Then what do these guys call the first fundamental form? None other than our beloved
infinitesimal distance squared: $g_{\mu\nu}\, dx^\mu dx^\nu$.

Next, differentiate $\vec t \cdot \vec n = 0$ to obtain (using (14))


$\vec t \cdot \frac{d\vec n}{dl} = -\kappa_n$ (17)


known as Weingarten’s equation after Julius Weingarten (1836-1910).

Gauss’s idea was to look at all curves passing through the point P. For each curve,
calculate $\kappa_n$. Following the great man, we are supposed to find the extremal values of $\kappa_n$.
So, let us extremize $K_{\mu\nu} t^\mu t^\nu$. But $t^\mu$ is not entirely free to vary: it has to satisfy $g_{\mu\nu} t^\mu t^\nu = 1$.
Thus, we have a constrained extremization problem.

But we know how to deal with this: introduce a Lagrange multiplier⁸ $k$. In other words,
vary $K_{\mu\nu} t^\mu t^\nu - k(g_{\mu\nu} t^\mu t^\nu - 1)$ with respect to $t^\mu$ and set the result to 0, thus obtaining
$(K_{\mu\nu} - k g_{\mu\nu}) t^\nu = 0$. Multiplying by $g^{\lambda\mu}$, we obtain an equation


$(g^{\lambda\mu} K_{\mu\nu} - k\, \delta^\lambda_\nu)\, t^\nu = 0$ (18)


for the eigenvalues of the 2-by-2 matrix $M^\lambda{}_\nu = g^{\lambda\mu} K_{\mu\nu}$. Contracting (18) with $t_\lambda = g_{\lambda\rho} t^\rho$,
we find $k = K_{\mu\nu} t^\mu t^\nu = \kappa_n$. In other words, the two eigenvalues $k_1$ and $k_2$ give the extremal
values of $\kappa_n$.

As usual, the eigenvalues are determined by


$\det(g^{-1}K - kI) = 0$ (19)


which (since $g^{-1}K$ is a 2-by-2 matrix) amounts to the quadratic equation $k^2 - \mathrm{tr}(g^{-1}K)\,k +
\det(g^{-1}K) = 0$. Thus, the product of the eigenvalues is given by


$G = \det(g^{-1}K) = \frac{\det K}{\det g}$ (20)


and the sum by


$S = \mathrm{tr}(g^{-1}K)$ (21)
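
For readers who like to see numbers, here is a minimal sketch (not part of the original text) of equations (18) through (21): given the 2-by-2 matrices $g$ and $K$ at a point, it forms $M = g^{-1}K$, takes the trace and determinant, and solves the quadratic for the two eigenvalues. The function name and the sample point $\theta = 0.9$ are illustrative choices; the sphere's $K$ is copied from (9).

```python
import math

def principal_curvatures(g, K):
    """Return (k_1, k_2, G, S) for 2-by-2 symmetric g and K, with M = g^{-1} K."""
    det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det_g, -g[0][1] / det_g],
             [-g[1][0] / det_g, g[0][0] / det_g]]
    M = [[sum(g_inv[i][r] * K[r][j] for r in range(2)) for j in range(2)]
         for i in range(2)]
    S = M[0][0] + M[1][1]                          # trace, equation (21)
    G = M[0][0] * M[1][1] - M[0][1] * M[1][0]      # determinant, equation (20)
    disc = math.sqrt(max(S * S - 4 * G, 0.0))      # roots of k^2 - S k + G = 0
    return (S - disc) / 2, (S + disc) / 2, G, S

if __name__ == "__main__":
    theta = 0.9
    g_sphere = [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]
    K_sphere = [[-1.0, 0.0], [0.0, -math.sin(theta) ** 2]]   # from (9)
    print(principal_curvatures(g_sphere, K_sphere))
```

The `max(..., 0.0)` guard is there only because rounding can push the discriminant slightly negative when the two eigenvalues coincide, as they do on the sphere.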
Intrinsic versus extrinsic again


In story 2 in the prologue, and also in the preceding chapter, you learned to distinguish 
between extrinsic and intrinsic curvatures. A cylinder has extrinsic curvature by virtue of 
how it is embedded in the ambient Euclidean space, but it has no intrinsic curvature. We 
can unroll a cylinder into a flat piece of paper. Like the mite geometers of the prologue, we 
cannot get out of the universe we live in, and so we are interested in the intrinsic, not the 
extrinsic, curvature, as was mentioned in the preceding chapter. 

The quantities G and S present us with two measures of curvature. How do we know 
which one is intrinsic? The easy way is to look at the cylinder, just as we did in the 
preceding chapter. It also gives us a chance to try out the machinery we just worked out. 


Recall from earlier in this chapter that $\vec e_1 = \begin{pmatrix} -a\sin\varphi \\ a\cos\varphi \\ 0 \end{pmatrix}$, $\vec e_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$, and $\vec n = \begin{pmatrix} \cos\varphi \\ \sin\varphi \\ 0 \end{pmatrix}$. First,
using (5), we already worked out the metric $g_{11} = a^2$, $g_{12} = g_{21} = 0$, $g_{22} = 1$. Next, we
take derivatives: $\partial_1 \vec e_1 = -a\vec n$, $\partial_1 \vec e_2 = \partial_2 \vec e_1 = \partial_2 \vec e_2 = 0$. Plugging these values into Gauss’s
equation (8) $K_{\mu\nu} = \vec e_{\mu,\nu} \cdot \vec n$, we obtain $K_{11} = -a$, with all other entries vanishing. Thus,
$\det K = 0$, so that $G = 0$. This immediately tells us (and told Gauss) that it is the product
G that measures the intrinsic curvature of the surface at the point P. The sum S measures
the extrinsic curvature. To be consistent with the previous chapter, we define the intrinsic
curvature to be $G$ and the extrinsic curvature to be $\mathcal{E} = \left(\frac{1}{2}S\right)^2$.


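For the surface we started the last section with (and on which you have been hiking),
we have $\vec X = \begin{pmatrix} x \\ y \\ \frac{1}{2}px^2 - \frac{1}{2}qy^2 \end{pmatrix}$ with $x^1 = x$ and $x^2 = y$. It follows that $\vec e_1 = \begin{pmatrix} 1 \\ 0 \\ px \end{pmatrix}$ and $\vec e_2 = \begin{pmatrix} 0 \\ 1 \\ -qy \end{pmatrix}$.
Note that to calculate the curvature at the point P, we need various quantities
only at P. At our chosen point P, where $x = y = 0$, the arithmetic simplifies. We have
$\vec n = \vec e_1 \times \vec e_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$, $ds^2 = \vec e_1 \cdot \vec e_1\, dx^2 + \vec e_2 \cdot \vec e_2\, dy^2 = dx^2 + dy^2$, so that $g_{11} = 1$, $g_{22} = 1$, and
$g_{12} = g_{21} = 0$. Furthermore, $\partial_1 \vec e_1 = p\vec n$, $\partial_2 \vec e_2 = -q\vec n$, and $\partial_1 \vec e_2 = \partial_2 \vec e_1 = 0$, so that from
$K_{\mu\nu} = \vec e_{\mu,\nu} \cdot \vec n$, we obtain $K_{11} = p$, $K_{22} = -q$, and $K_{12} = K_{21} = 0$. Thus, $G = -pq$ and
$S = p - q$.

For a sphere of radius $1/a$, $p = -q = a$, and so $G = a^2 = \mathcal{E}$. We have calculated these
quantities only at $(x, y) = (0, 0)$, but of course for the sphere, this amounts to knowing
the intrinsic and extrinsic curvatures everywhere.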
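
The whole chain from embedding to $G$ and $S$ can be run numerically for any surface $\vec X(x^1, x^2)$. The sketch below is not from the original text: it builds $\vec e_\mu$, $\vec n$, $g_{\mu\nu}$, and $K_{\mu\nu}$ by finite differences and then evaluates (20) and (21). The sample values a = 2, p = 3, q = 2, the step size, and all function names are illustrative assumptions.

```python
import math

def cylinder(phi, z, a=2.0):
    return (a * math.cos(phi), a * math.sin(phi), z)

def saddle(x, y, p=3.0, q=2.0):
    return (x, y, 0.5 * p * x * x - 0.5 * q * y * y)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def curvatures(X, x1, x2, h=1e-4):
    """Return (G, S) at the point (x1, x2) of the embedded surface X."""
    def at(d1, d2):
        return X(x1 + d1 * h, x2 + d2 * h)

    def comb(terms, scale):
        # Linear combination of sampled 3-vectors, divided by scale.
        return tuple(sum(c * v[i] for c, v in terms) / scale for i in range(3))

    # First derivatives: the basis vectors e_1, e_2.
    e = [comb([(1, at(1, 0)), (-1, at(-1, 0))], 2 * h),
         comb([(1, at(0, 1)), (-1, at(0, -1))], 2 * h)]
    # Second derivatives: e_{1,1}, e_{2,2}, and e_{1,2} = e_{2,1}.
    d11 = comb([(1, at(1, 0)), (-2, at(0, 0)), (1, at(-1, 0))], h * h)
    d22 = comb([(1, at(0, 1)), (-2, at(0, 0)), (1, at(0, -1))], h * h)
    d12 = comb([(1, at(1, 1)), (-1, at(1, -1)), (-1, at(-1, 1)), (1, at(-1, -1))],
               4 * h * h)
    c = cross(e[0], e[1])
    length = math.sqrt(dot(c, c))
    n = tuple(x / length for x in c)                               # equation (6)
    g = [[dot(e[m], e[k]) for k in range(2)] for m in range(2)]    # equation (5)
    K = [[dot(d11, n), dot(d12, n)], [dot(d12, n), dot(d22, n)]]   # equation (8)
    det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    det_K = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    G = det_K / det_g                                              # equation (20)
    # S = tr(g^{-1} K), equation (21), written out for 2-by-2 matrices.
    S = (g[1][1] * K[0][0] - 2 * g[0][1] * K[0][1] + g[0][0] * K[1][1]) / det_g
    return G, S

if __name__ == "__main__":
    print("cylinder:", curvatures(cylinder, 0.3, 1.1))
    print("saddle at P:", curvatures(saddle, 0.0, 0.0))
    print("saddle away from P:", curvatures(saddle, 0.2, -0.1))
```

At the saddle point the output can be compared with $G = -pq$ and $S = p - q$; away from P the same routine still works, which is the practical payoff of not specializing the arithmetic to $x = y = 0$.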

If $G > 0$, then the two eigenvalues $k_1$ and $k_2$ have the same⁹ sign, and the surface at P
is spherical. If $G < 0$, the two eigenvalues have opposite signs. The surface is shaped like
a saddle (or potato chip) and is hyperbolic at P. If $G = 0$, the surface is cylindrical at P.

The whole point of Riemann’s formalism (as will be developed in detail in part VI) is 
that we do not need to embed the space we are studying in some ambient Euclidean space. 
But we have lots of intuition about 3-dimensional Euclidean space (and, as I said before, 
who can blame us?), and so this 18th and 19th century differential geometry is a lot easier 
to visualize and to grasp. 

With the benefit of hindsight, Gauss’s strategy is also perfectly reasonable. To find the
curvature at a point P on the surface, you build race tracks going through P. Find the race
tracks with the largest and smallest possible curvature. The product and sum of these two
curvatures measure the curvature, intrinsic and extrinsic, respectively.


Appendix: Spherical coordinates 


Throughout this text, we will often be changing to spherical coordinates. You must have done this about a
hundred times in courses on mechanics and on electromagnetism. So it might be good to collect some useful
information here and to define $\omega^1 = \sin\theta\cos\varphi$, $\omega^2 = \sin\theta\sin\varphi$, and $\omega^3 = \cos\theta$. Denote* the coordinates in $E^3$ by
$(v^1, v^2, v^3)$. Transform to spherical coordinates $(r, \theta, \varphi)$ by $v^i = f(r)\,\omega^i$, $i = 1, 2, 3$. Differentiating $\sum_i (\omega^i)^2 = 1$,
we have $\sum_i \omega^i d\omega^i = 0$, where $d\omega^i = \partial_\theta \omega^i\, d\theta + \partial_\varphi \omega^i\, d\varphi$. Furthermore, $\sum_i \partial_\theta \omega^i\, \partial_\varphi \omega^i = 0$, $\sum_i (\partial_\theta \omega^i)^2 = 1$, and
$\sum_i (\partial_\varphi \omega^i)^2 = \sin^2\theta$, so that $\sum_i (d\omega^i)^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$.

Thus, the metric is given by $ds^2 = \sum_i (dv^i)^2 = \sum_i \left( f'(r)\,dr\,\omega^i + f(r)\,d\omega^i \right)^2 = f'(r)^2 dr^2 + f(r)^2 \sum_i (d\omega^i)^2 =
f'(r)^2 dr^2 + f(r)^2 (d\theta^2 + \sin^2\theta\, d\varphi^2)$. For example, to derive (X.1.20) and (X.1.21), we will need to use a slightly
generalized version of this result.
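
The appendix result is also easy to spot-check numerically. In the following sketch (not from the original text) the radial profile $f(r) = r^2$ is a hypothetical choice made only so that $f' \neq 1$; the code differentiates $v^i = f(r)\,\omega^i$ with respect to $(r, \theta, \varphi)$ by finite differences and assembles the induced metric, which can then be compared with $f'(r)^2 dr^2 + f(r)^2 (d\theta^2 + \sin^2\theta\, d\varphi^2)$. Function names are invented for this illustration.

```python
import math

def f(r):
    """Hypothetical radial profile, chosen only for illustration."""
    return r * r

def f_prime(r):
    return 2 * r

def v(r, theta, phi):
    """v^i = f(r) omega^i, with omega the unit vector in spherical angles."""
    omega = (math.sin(theta) * math.cos(phi),
             math.sin(theta) * math.sin(phi),
             math.cos(theta))
    return tuple(f(r) * w for w in omega)

def induced_metric(r, theta, phi, h=1e-5):
    """g_ab = sum_i (dv^i/dq^a)(dv^i/dq^b) for q = (r, theta, phi)."""
    q = (r, theta, phi)
    J = []
    for a in range(3):
        plus, minus = list(q), list(q)
        plus[a] += h
        minus[a] -= h
        J.append(tuple((x - y) / (2 * h) for x, y in zip(v(*plus), v(*minus))))
    return [[sum(x * y for x, y in zip(J[a], J[b])) for b in range(3)]
            for a in range(3)]

if __name__ == "__main__":
    r, theta, phi = 1.3, 0.8, 0.4
    g = induced_metric(r, theta, phi)
    print("numerical diagonal:", [g[a][a] for a in range(3)])
    print("expected diagonal: ", [f_prime(r) ** 2, f(r) ** 2, (f(r) * math.sin(theta)) ** 2])
```

The off-diagonal entries of the numerical metric should come out at the level of the finite-difference error, reflecting the orthogonality relations listed above.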


Exercises 
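1 Calculate the curvature $\kappa$ and the torsion $\tau$ of the exponential spiral $\vec X = \begin{pmatrix} f(\varphi)\cos\varphi \\ f(\varphi)\sin\varphi \\ g(\varphi) \end{pmatrix}$. Take $f(\varphi) = e^{a\varphi}$ and
$g(\varphi) = 0$ for simplicity.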
2 Find an expression for $(\vec V \to y)_P$.


3 Show that if $k_1 \neq k_2$, the two corresponding eigenvectors $\vec t_1$ and $\vec t_2$ are orthogonal in accordance with our
geometrical intuition.


4 On a cylinder, draw the curve defined by $X^1 = a\cos l$, $X^2 = a\sin l$, and $X^3 = bl$, with $l$ the length. Show that
$a^2 + b^2 = 1$. Calculate the curvature and the torsion.


5 Calculate G for a unit sphere. (Anticipating a bit, we will see that G is equal to the Riemann curvature R171.) 


6 Show that if the mite professors of geometry are given the metric, they can determine the Christoffel symbol,
even though they don’t know about $\vec n$.


Notes 


1. E. Kreyszig, Differential Geometry, University of Toronto Press, 1959. 

2. Several years ago, I gave some chapters from part I to a few University of California, Santa Barbara, 
undergrads to read. One of them emailed me a few days later. I cannot resist quoting, with insignificant 
edits, from his email. 


I’m very excited about your approach to introducing differential geometry first in $E^3$ before doing GR
for multiple reasons. Conversing with another undergraduate physics major who took the undergrad- 
uate GR course here, we agreed that general relativity books and courses ought to spend more time 
describing intuitive and easily visualizable examples, and laying down as rigorously as possible (given 


* This is to avoid confusion with the X = (X1, X?, X3) in the text. 
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the constraints of GR courses needing to be about physics, not math) the meaning and definitions of 
curvature, geodesics, etc, at least partially through examples in E* and E%. I found it spooky that a day 
after having this conversation, I read these chapters and found they did basically what I had hoped an 
introductory book to general relativity would do. I found this chapter personally more useful than the 
other chapters because the way I learned differential geometry was the more abstract approach. 


Spooky indeed! 
3. Indeed, you already had a first taste in the preceding chapter.


4. In an alternative view, which I do not like as much, we can think of the covariant derivative as acting only
on objects carrying the vectorial arrow. Insisting that $D_\mu$ is distributive just like $\partial_\mu$, some authors write
$D_\mu\vec W = D_\mu(W^\nu\vec e_\nu) = (\partial_\mu W^\nu)\vec e_\nu + W^\nu(D_\mu\vec e_\nu)$. Note that in this formulation, $D_\mu$ acting on the “numbers” $W^\nu$
is given by the ordinary derivative by definition. Comparing with (12) then gives $D_\mu\vec e_\nu = \Gamma^\lambda_{\mu\nu}\vec e_\lambda$. Referring
back to (7), we see that $D_\mu\vec e_\nu$ differs from $\partial_\mu\vec e_\nu$ by a term proportional to $\vec n$, as we might expect.

5. Here is a hint. Gears convert the rotation of the left wheel and the right wheel separately into rotations around
the vertical axis. Another differential gear responds to the difference in the output from the two wheels and 
rotates the statue around the vertical axis. If the chariot is moving in a straight line, the statue is not rotated. 
But when the chariot moves along a curve, the right wheel (say) rotates more than the left, and this difference 
gets converted into a rotation of the statue around the vertical axis, compensating for the turning of the body 
of the chariot. In other words, as the chariot turns this way and that, the statue points in a fixed direction. 
To the extent that Riemannian surfaces are locally flat, the war chariot also more or less works on a surface 
that is not flat, provided that the chariot is much smaller than the radius of curvature, and after subtracting 
out the (negligible) vertical component of the vector represented by the Emperor’s arm. 

6. Need I remind you to distinguish between these two uses of the word “curvature”?

7. The word “form” here does not carry the same meaning as the word “form” in chapter IX.7.

8. For those readers who have forgotten the notion of a Lagrange multiplier from their course on calculus, here
is a quick review. The problem is to extremize a function $f(x, y)$ with the constraint $g(x, y) = 0$. The brute
force approach would be to solve $g(x, y) = 0$ to obtain $y(x)$, eliminate $y$ in $f(x, y)$, and extremize $f(x, y(x))$.
The same Lagrange you were introduced to in the text invented the following more symmetrical, and often
better, method. Form the function $h(x, y) = f(x, y) + \lambda g(x, y)$, where $\lambda$ is known as the Lagrange multiplier.
Extremize $h(x, y)$ to determine $x$ and $y$ in terms of $\lambda$, and then impose the constraint $g(x, y) = 0$. An example:
$f(x, y) = ax + by$ and $g(x, y) = \frac{1}{2}(x^2 + y^2 - 1)$. In other words, we are to extremize $f(x, y) = ax + by$, with
$x$ and $y$ constrained to the unit circle. Following Lagrange, we obtain after the first step $x = -a/\lambda$, $y = -b/\lambda$.
Imposing $x^2 + y^2 = 1$, we have $\lambda = \pm\sqrt{a^2 + b^2}$. For $a > 0$, $b > 0$, the minus root gives the maximum of $f(x, y)$
at $x = a/\sqrt{a^2 + b^2}$, $y = b/\sqrt{a^2 + b^2}$, while the plus root gives the minimum at the diametrically opposite point.


9. We are dealing with Euclidean surfaces here, so that $\det g > 0$.


Recap to Part I


An essential feature of Newton’s force law is that it involves two derivatives. The presence 
of two derivatives will permeate almost everything we do. 

For a given situation, a judicious choice of coordinates makes our lives a lot easier. 
Coordinates can of course be freely chosen, but the square of the separation between two 
neighboring points, which according to Pythagoras has the form $ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu$,
must not depend on the coordinate choice. Once we master the changing of coordinates, 
it is but a short hop over to curved spaces. 

Choosing coordinates wisely, we can always make our immediate neighborhood look 
flat. The extent to which we notice deviation from local flatness as we move away from 
our neighborhood is the measure of curvature. A simple counting argument tells us how 
many numbers we need to characterize the intrinsic curvature at a given point. To get a 
feel for curvature, it is good to spend some time back in the good old days with the likes 
of Gauss and play with some surfaces we can literally hold in our hands. 


Part II | Action, Symmetry, and Conservation 


The Hanging String and Variational Calculus 


The hanging string 


To understand the action principle, which to a large extent will permeate this book, we 
have to master a slight generalization of ordinary calculus known as variational calculus. 
In ordinary calculus, we take derivatives with respect to some variable, typically a real 
number. In variational calculus, we take derivatives with respect to a function. To learn 
what that means and to see how variational calculus arises in physics, let us start with a 
simple problem. 

Throw a marble into a bowl. When you come back later, you expect it to be sitting at rest 
at the bottom of the bowl. This is formalized by saying that, if we denote the cross section 
of the bowl by v(x, y) so that the potential energy of the marble is proportional to v(x, y), 
the position of the marble is found by solving $\frac{\partial v}{\partial x} = 0$ and $\frac{\partial v}{\partial y} = 0$.

To explain variational calculus, let us tackle a problem in baby string theory. Consider an 
ideal elastic string tied down at two ends and hanging under the force of gravity. See figure 
1 for how we set up our coordinates. Denote by $\phi(x)$ the amount by which the bit of string
at $x$ hangs below the horizontal line. That the string is tied down at the two ends gives
us the boundary conditions $\phi(L/2) = \phi(-L/2) = 0$. We want to solve for the shape of the
hanging string, which is of course determined by the tug of war between the downward 
pull of gravity and the elastic force trying to minimize the amount of stretch in the string. 

The elastic energy is given by a constant T intrinsic to the string times the stretch, 
defined as the length of the string minus the original length, which, as you could verify 
later, does not come into our calculation, so that we might as well take it to be L. To 
find the length of the string, think of how Newton and Leibniz discovered calculus and 
imagine dividing the string up into little segments labeled by $j$. Pythagoras tells us that
each segment has length given by (see figure 1) $\sqrt{\Delta x_j^2 + \Delta\phi_j^2} = \Delta x_j\sqrt{1 + (\Delta\phi_j/\Delta x_j)^2}$. Taking
the Newton-Leibniz limit, we see that the length of the string is equal to $\int\sqrt{dx^2 + d\phi^2} =
\int_{-L/2}^{L/2}dx\,\sqrt{1 + (d\phi/dx)^2}$. (Henceforth, we will carry out this sort of manipulation without further
ado. We will also tend to suppress the integration limits.) Thus, the elastic energy is equal
to $T\left(\int dx\,\sqrt{1 + (d\phi/dx)^2} - L\right) = T\int dx\left(\sqrt{1 + (d\phi/dx)^2} - 1\right)$.

Figure 1 An elastic string tied down at two ends hangs under the force
of gravity.


For pedagogical clarity, we consider the case where $|d\phi/dx| \ll 1$, that is, when the string
is stretched only by a little bit, so that $\sqrt{1 + (d\phi/dx)^2} \simeq 1 + \frac{1}{2}(d\phi/dx)^2$. (It turns out that this
simplification is not necessary, and you can work things out without it. See exercise 1.)
The gravitational energy is given by $\int dx\,(-\sigma g\phi(x))$, where $\sigma$ denotes the mass per
unit length of our string. Note the minus sign: we have chosen $\phi(x)$ to point downward,
as indicated in the figure. Again, we have assumed that the stretch is small.
Thus the string has energy 


$$E(\phi) = \int dx\left(\frac{T}{2}\left(\frac{d\phi}{dx}\right)^2 - \sigma g\phi(x)\right) \tag{1}$$


Extremizing a functional 


Note that $E$ is not a function of a real variable named $\phi$. Rather, $E$ is known as a functional
of a real valued function $\phi(x)$. When you plug a number $x$ into a function $f$, you get out a
number $f(x)$. Analogously, when you plug a function $\phi(x)$ into the functional $E$, you get
out a number, the string energy $E(\phi)$. For total clarity, we could write $E(\phi(\cdot))$, with the
dot emphasizing that $\phi$ is itself a function. Note that we should not write $E(\phi(x))$ in (1):
$x$ is a dummy integration variable on the right hand side.

Our task is to find the specific function $\phi(x)$ that minimizes the energy $E$ of the string.
In ordinary calculus, we differentiate a function with respect to its argument and then set 
the result to zero to find the extremum of the function. Analogously, in variational calculus, 
we differentiate a functional with respect to its argument, which is a function, and then 
set the result to zero to find the extremum of the functional. 

But how do we differentiate with respect to a function? 

For pedagogical clarity, let us go back to the marble in the bowl and write $\delta v = \frac{\partial v}{\partial x}\delta x +
\frac{\partial v}{\partial y}\delta y$ to first order in $\delta x$ and $\delta y$, some arbitrary and small variations in the position of the
marble. We notice that $\delta v$ vanishes if and only if $\frac{\partial v}{\partial x} = 0$ and $\frac{\partial v}{\partial y} = 0$. To first order in $\delta x$
and $\delta y$, the variation of $v$ vanishes if we happen to be sitting at an extremum. To second
order, the variation of $v$ is positive if we are at a minimum. In other words, we nudge the
marble and if it costs us energy, we know that it is sitting at the bottom of the bowl. All of 
this is elementary and well understood by you. So now we do the same: we push the string 
slightly and ask if it costs us energy. 

We vary the function $\phi(x)$ by changing it to $\phi(x) + \eta(x)$. We then compare $E(\phi + \eta)$
and $E(\phi)$. Take $\eta(x)$ small, so that it suffices to calculate the change in energy $\delta E =
E(\phi + \eta) - E(\phi)$, expanding in $\eta$. To first order in $\eta$, $\delta E$ should vanish. If the shape $\phi(x)$
minimizes the energy, $\delta E$ would furthermore be positive to second order.

Let us go slow and first deal with the second term in (1): taking out the overall constant 
$-\sigma g$, we have $\int dx\,\{(\phi(x) + \eta(x)) - \phi(x)\} = \int dx\,\eta(x)$. That was easy!

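The first term in (1) is only slightly more difficult to deal with. Since $\frac{d}{dx}(\phi + \eta) =
\frac{d\phi}{dx} + \frac{d\eta}{dx}$, we find, taking out the overall constant $T$,

$$\int dx\,\frac{1}{2}\left\{\left(\frac{d\phi}{dx} + \frac{d\eta}{dx}\right)^2 - \left(\frac{d\phi}{dx}\right)^2\right\} \simeq \int dx\,\frac{d\phi}{dx}\frac{d\eta}{dx} = \left[\eta\,\frac{d\phi}{dx}\right]_{x=-L/2}^{x=L/2} + \int dx\left(-\frac{d^2\phi}{dx^2}\right)\eta \tag{2}$$

In the last step, we integrated by parts. Note that our boundary conditions tying down the
string at the two ends, $\eta(x = L/2) = 0 = \eta(x = -L/2)$, imply that the boundary terms in
(2) vanish.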
Putting it together, we have 
$$\delta E = \int dx\left(-T\frac{d^2\phi}{dx^2} - \sigma g\right)\eta(x) \tag{3}$$


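Since $\eta(x)$ is arbitrary, $\delta E$ can vanish only if the coefficient of $\eta(x)$ in the integrand in (3) vanishes. Thus, the shape
of the hanging string is determined by the differential equation

$$T\frac{d^2\phi}{dx^2} = -\sigma g \tag{4}$$

namely, a graceful parabola described by $\phi(x) = \frac{\sigma g}{2T}\left\{\left(\frac{L}{2}\right)^2 - x^2\right\}$. At the two ends, $\phi\left(\frac{L}{2}\right) =
\phi\left(-\frac{L}{2}\right) = 0$, which just expresses the boundary conditions. In the middle, $\phi(0) = \frac{\sigma g}{2T}\left(\frac{L}{2}\right)^2$.
(Remember our convention that $\phi > 0$ means hanging down.)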

The physics is simple, and the math merely describes the physics. In (1), the first term 
wants to make $\left(\frac{d\phi}{dx}\right)^2$ small, that is to make $\phi(x)$ constant, and hence $\phi(x) = 0$ with the
given boundary conditions. In contrast, the second term wants to make $\phi(x)$ as large and
as positive as possible. The actual shape is a compromise between these two terms. This
theme of compromise will pervade this book. 
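For readers who like to see numbers, here is a minimal sketch that checks the parabola on a computer. It chops the string into small segments, solves the discretized version of (4) with the standard tridiagonal (Thomas) elimination, and compares with the formula above. The function names and the sample values of $T$, $\sigma$, $g$, and $L$ are illustrative assumptions, not taken from the text.

```python
import math

def solve_hanging_string(T, sigma, g, L, n):
    """Solve T * phi'' = -sigma * g with phi(-L/2) = phi(L/2) = 0 on n interior points."""
    a = L / (n + 1)                      # spacing between neighboring points
    rhs = -sigma * g * a * a / T         # phi[j+1] - 2 phi[j] + phi[j-1] = rhs
    # Thomas algorithm for the tridiagonal system (1, -2, 1).
    c = [0.0] * n                        # modified super-diagonal
    d = [0.0] * n                        # modified right-hand side
    c[0] = -0.5
    d[0] = rhs / -2.0
    for j in range(1, n):
        denom = -2.0 - c[j - 1]
        c[j] = 1.0 / denom
        d[j] = (rhs - d[j - 1]) / denom
    phi = [0.0] * n
    phi[n - 1] = d[n - 1]
    for j in range(n - 2, -1, -1):       # back substitution
        phi[j] = d[j] - c[j] * phi[j + 1]
    xs = [-L / 2 + (j + 1) * a for j in range(n)]
    return xs, phi, a

def parabola(x, T, sigma, g, L):
    """The closed-form shape quoted in the text."""
    return sigma * g / (2 * T) * ((L / 2) ** 2 - x * x)

def energy(phi, T, sigma, g, a):
    """Discretized version of (1); the two tied-down endpoints are added by hand."""
    full = [0.0] + list(phi) + [0.0]
    stretch = sum((full[j + 1] - full[j]) ** 2 for j in range(len(full) - 1)) / a
    return 0.5 * T * stretch - sigma * g * a * sum(full)

if __name__ == "__main__":
    T, sigma, g, L = 2.0, 0.5, 9.8, 1.0  # hypothetical sample values
    xs, phi, a = solve_hanging_string(T, sigma, g, L, 99)
    worst = max(abs(p - parabola(x, T, sigma, g, L)) for x, p in zip(xs, phi))
    print("largest deviation from the parabola:", worst)
    bumped = [p + 1e-3 * math.sin(math.pi * (x + L / 2) / L) for x, p in zip(xs, phi)]
    print("energy cost of a small bump:", energy(bumped, T, sigma, g, a) - energy(phi, T, sigma, g, a))
```

Nudging the solution by a small bump that vanishes at the two ends raises the energy, which is the string version of nudging the marble at the bottom of the bowl.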


General lessons 


We are less interested in the hanging string than in what general lessons we can learn 
from this simple example. Here are some remarks. 


1. Evidently $\delta\int dx\,\phi^n = \int dx\,\{(\phi + \eta)^n - \phi^n\} \simeq \int dx\,n\phi^{n-1}\eta$, where for the sake of notational
simplicity, we have suppressed the $x$ dependence of $\phi$ and $\eta$. Since any function of $\phi$ can
be expanded as a power series, we have, for instance, $\delta\int dx\cos\phi = \int dx\,(\cos(\phi + \eta) -
\cos\phi) \simeq -\int dx\,(\sin\phi)\eta$. In general, $\delta\int dx\,V(\phi) = \int dx\,(V(\phi + \eta) - V(\phi)) \simeq \int dx\,V'(\phi)\eta$,
where $V'(\phi) = \frac{dV}{d\phi}$ is computed by pretending that $V(\phi)$ is a function of a real variable $\phi$,
ignoring the fact that $\phi$ is itself a function.

Now that the utility of $\eta$ has come to an end, we might as well, in analogy with ordinary
calculus, define the functional derivative by

$$\frac{\delta}{\delta\phi(y)}\int dx\,V(\phi(x)) = V'(\phi(y)) \tag{5}$$

Note that to be careful, we have restored the argument of the function $\phi$. It is important to
realize that here $x$ is a dummy variable to be integrated over, but $y$ is a “free” variable: we
are free to vary the function $\phi$ at a point $y$ of our choice. To our satisfaction, we see that the
variation depends only on quantities evaluated at the point $y$. This states that the energy
density is a local quantity and does not depend on what is happening far away from $y$.


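2. Suppose we now have to vary $\int dx\,F\left(\frac{d\phi}{dx}\right)$. By the same reasoning, we have

$$\delta\int dx\,F\left(\frac{d\phi}{dx}\right) = \int dx\left\{F\left(\frac{d\phi}{dx} + \frac{d\eta}{dx}\right) - F\left(\frac{d\phi}{dx}\right)\right\} \simeq \int dx\,F'\left(\frac{d\phi}{dx}\right)\frac{d\eta}{dx} = -\int dx\left\{\frac{d}{dx}F'\left(\frac{d\phi}{dx}\right)\right\}\eta$$

where, as before, we have integrated by parts and dropped the boundary terms since $\eta(x)$
vanishes at the boundaries. Once again, $F'$ is defined as if the argument of $F$ were an
ordinary real number. In our simple example, $F(u) = \frac{1}{2}u^2$ and so $F'(u) = u$. See exercise 1
for another example. Again, we can drop $\eta$ and write $\frac{\delta}{\delta\phi(y)}\int dx\,F\left(\frac{d\phi}{dx}\right) = -\frac{d}{dy}F'\left(\frac{d\phi}{dy}\right)$.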


Thus, in general, if the energy functional is given by

$$E(\phi) = \int dx\left\{F\left(\frac{d\phi}{dx}\right) + V(\phi)\right\} \tag{6}$$

with the boundary condition that $\phi(x)$ vanishes at the integration endpoints, we obtain the
equation

$$\frac{d}{dx}F'\left(\frac{d\phi}{dx}\right) - V'(\phi(x)) = 0 \tag{7}$$


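Even more generally, if the energy functional is given by

$$E(\phi) = \int dx\,\mathcal{E}\left(\frac{d\phi}{dx}, \phi\right) \tag{8}$$

again with the standard boundary condition that $\phi(x)$ vanishes at the integration endpoints,
we obtain

$$\frac{d}{dx}\left(\frac{\delta\mathcal{E}}{\delta\frac{d\phi}{dx}}\right) - \frac{\delta\mathcal{E}}{\delta\phi} = 0 \tag{9}$$

Verify this: you will grasp the notation better. As in (5), we now pretend that $\mathcal{E}(a, b)$ is an
ordinary function of two variables $a$ and $b$. By $\frac{\delta\mathcal{E}}{\delta\frac{d\phi}{dx}}$ we mean $\frac{\partial\mathcal{E}(a, b)}{\partial a}$ with $a$ subsequently set
equal to $\frac{d\phi}{dx}$ and $b$ to $\phi(x)$. I will leave it to the reader to figure out what $\frac{\delta\mathcal{E}}{\delta\phi}$ means.

The equation (9), known as the Euler-Lagrange equation, is of fundamental importance
in theoretical physics.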
3. Bad notation alert! The present discussion highlights the notational confusion I mentioned
in chapter I.1 that bedevils some students. For the marble in the bowl problem, it would
have been best to reserve $x$ and $y$ for spatial coordinates and to denote the position of the
marble by $q_1$ and $q_2$, say. In the mechanics of point particles, it is standard to abuse notation
and use $x, y, \ldots$ for both spatial coordinates and for the positions of particles. Generally
there is no confusion.

But when we go to continuum mechanics, such as our hanging string problem, we must
distinguish between dynamical variables (in our example, just the function $\phi(x)$) and spatial
coordinates (in our example, the single coordinate $x$). Here $x$ serves as a label to tell us which
infinitesimal segment of the string we are talking about. The displacement of that particular
string segment from where it would have been were gravity turned off is given by $\phi(x)$. (This
is a mouthful for saying that $\phi(x)$ denotes the amount by which the string is hanging down
at the point $x$, but I want to be precise and academic here.) Note that the 2-dimensional
position of that particular string segment is given by $(x, \phi(x))$. Thus, in some sense, the
letter $x$, seen this way, is doing double duty, both as a label and as a position. When we
introduce time in a later chapter, it will become clear that $x$ has no dynamics but $\phi(x)$ does.
Some books use the horrendous notation $y(x)$ instead of $\phi(x)$, which has caused endless
confusion. I belabor these rather obvious points because I have seen too many students
getting confused, particularly when they encounter field theory, classical or quantum.


4. Another bad notation alert! In most textbooks, the variation of $\phi(x)$ is written as $\delta\phi(x)$.
This confuses some students, because they think of $\delta$ as some operation acting on $\phi(x)$
and quite legitimately worry whether the two operations $\delta$ and $\frac{d}{dx}$ commute. Instead,
I write $\eta(x)$ for $\delta\phi(x)$. In particular, a manipulation analogous to the first step in (2),
$\frac{d}{dx}(\phi + \eta) - \frac{d\phi}{dx} = \frac{d\phi}{dx} + \frac{d\eta}{dx} - \frac{d\phi}{dx} = \frac{d\eta}{dx}$, proves that $\delta\frac{d\phi}{dx}$ really is equal to $\frac{d\,\delta\phi}{dx}$. Now that I
have clarified this point, I will mostly lapse into the more explanatory notation $\delta\phi(x)$, which
also avoids introducing yet another symbol.


5. The careful reader might worry that we have found only the extremum, rather than the
minimum, of $E(\phi)$. In most problems, that the solution is a minimum of the energy should
be physically clear, as is the case here. To show that our solution is indeed a minimum of
$E(\phi)$, we would have to expand to second order in $\delta\phi(x) = \eta(x)$. This is especially easy in
our example here, because $V(\phi)$ is linear in $\phi$, so that we can read off from (2) that the
second order variation of $E(\phi)$ is given by the manifestly positive quantity $\frac{T}{2}\int dx\left(\frac{d\eta}{dx}\right)^2$.


6. With some practice, you will be able to do variational calculus without having to go through 
the steps we went through here. When you go on to study quantum field theory, you will 


encounter these so-called functional derivatives all over the place. 


7. Instead of gravity pulling down on the string uniformly, we could load the string unevenly.
Indeed, for convenience, let’s define $E(\phi)$ with an overall factor of $T$ taken out in (1) and
replace the constant $\sigma g/T$ by a specified function $\rho(x)$, so that

$$E(\phi) = \int dx\left(\frac{1}{2}\left(\frac{d\phi}{dx}\right)^2 - \rho(x)\phi(x)\right)$$
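As a quick illustration of how the Euler-Lagrange equation (9) is used in practice, here is the unevenly loaded string of remark 7 worked through step by step. The intermediate symbols $a$ and $b$ are the placeholder variables of remark 2; nothing else is assumed.

```latex
% Energy density of the loaded string, regarded as an ordinary function of (a, b):
\mathcal{E}(a, b) = \tfrac{1}{2}a^2 - \rho(x)\,b ,
\qquad a \to \frac{d\phi}{dx}, \quad b \to \phi(x)

% The two partial derivatives entering (9):
\frac{\delta\mathcal{E}}{\delta\frac{d\phi}{dx}} = \frac{\partial\mathcal{E}}{\partial a}\bigg|_{a = d\phi/dx} = \frac{d\phi}{dx},
\qquad
\frac{\delta\mathcal{E}}{\delta\phi} = \frac{\partial\mathcal{E}}{\partial b} = -\rho(x)

% Plugging into (9):
\frac{d}{dx}\left(\frac{d\phi}{dx}\right) + \rho(x) = 0
\quad\Longrightarrow\quad
\frac{d^2\phi}{dx^2} = -\rho(x)
```

For a uniform load $\rho(x) = \sigma g/T$ this reduces to (4), as it should.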
From string to membrane


Several possible generalizations immediately suggest themselves: we could increase 
the number of spatial coordinates, or we could increase the number of functions, or 
both. We consider the first possibility here and defer the second possibility to the next 
chapter. 

Go from a hanging string to a hanging membrane (or “brane” for short). See figure 2.
We generalize the energy functional (1) as amended above to

$$E(\phi) = \int dx\,dy\left(\frac{1}{2}\left(\frac{\partial\phi}{\partial x}\right)^2 + \frac{1}{2}\left(\frac{\partial\phi}{\partial y}\right)^2 - \rho(x, y)\phi(x, y)\right)$$

with the boundary condition that $\phi(x, y)$ vanishes along some nice closed curve (such
as a circle) in the $(x$-$y)$ plane. Note that $E(\phi)$ involves an integral over the two spatial
coordinates $(x, y)$.

While you can easily derive this expression for the energy by working out how much 
the membrane is stretched, we can simply use rotational invariance to fix the form of 
the integrand: the energy should not change when we rotate $(x, y)$. Recall from exercise
I.4.1 that $\left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2$ transforms like a scalar, and recall the accompanying discussion in
chapter I.4.

Again the physics behind the various terms is clear. To minimize the first two terms, we 
want $\frac{\partial\phi}{\partial x}$ and $\frac{\partial\phi}{\partial y}$ to be as small as possible, that is, to stretch the membrane as little
as possible. In contrast, to minimize the third term in $E(\phi)$, we want $\phi$ to be positive (we
are taking the load $\rho(x, y)$ to be positive) and as large as possible to lower the energy. Just
as in the hanging string, it is the struggle between these two terms that determines the
shape of the membrane. Note that once again we choose to have $\phi$ point downward; hence
the minus sign in the potential term.

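Varying $E$ and going through the same steps as before, we generalize (2) trivially to

$$\int dx\,dy\,\frac{1}{2}\left\{\left(\frac{\partial\phi}{\partial x} + \frac{\partial\eta}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y} + \frac{\partial\eta}{\partial y}\right)^2 - \left(\frac{\partial\phi}{\partial x}\right)^2 - \left(\frac{\partial\phi}{\partial y}\right)^2\right\} \simeq \int dx\,dy\left(\frac{\partial\phi}{\partial x}\frac{\partial\eta}{\partial x} + \frac{\partial\phi}{\partial y}\frac{\partial\eta}{\partial y}\right) = -\int dx\,dy\left(\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2}\right)\eta$$

Thus, we obtain Poisson’s equation¹ $\nabla^2\phi(x, y) = -\rho(x, y)$, with the Laplacian $\nabla^2 =
\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$.

Figure 2 A hanging membrane.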
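To get a feel for what a solution of Poisson’s equation looks like, here is a small sketch that relaxes a membrane numerically. It assumes, purely for convenience, a unit square boundary (rather than the circle mentioned in the text), a uniform load $\rho = 1$, and plain Gauss-Seidel sweeps; the function names and grid size are illustrative choices.

```python
def solve_membrane(n=20, rho=1.0, sweeps=2000):
    """Relax the discretized Poisson equation on a unit square with phi = 0 on the edge."""
    a = 1.0 / n                                    # grid spacing
    phi = [[0.0] * (n + 1) for _ in range(n + 1)]  # boundary entries stay at zero
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                # Discrete Laplacian(phi) = -rho, solved for the central value.
                phi[i][j] = 0.25 * (phi[i + 1][j] + phi[i - 1][j]
                                    + phi[i][j + 1] + phi[i][j - 1]
                                    + a * a * rho)
    return phi, a

def residual(phi, a, rho=1.0):
    """Largest violation of the discretized equation Laplacian(phi) + rho = 0."""
    n = len(phi) - 1
    worst = 0.0
    for i in range(1, n):
        for j in range(1, n):
            lap = (phi[i + 1][j] + phi[i - 1][j] + phi[i][j + 1]
                   + phi[i][j - 1] - 4.0 * phi[i][j]) / (a * a)
            worst = max(worst, abs(lap + rho))
    return worst

if __name__ == "__main__":
    phi, a = solve_membrane()
    mid = len(phi) // 2
    print("sag at the center:", phi[mid][mid])
    print("residual:", residual(phi, a))
```

The membrane sags most at the center and not at all on the boundary, the 2-dimensional analog of the parabola found for the string.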
Newton’s gravitational potential as a field


A deep strand . . . was his total love of the idea of a field... . 
which made him know that there had to be a field theory of 


gravitation, long before the clues to that theory were securely in 
his hand. 


—Freeman Dyson speaking of Einstein 


I went through this membrane example for a reason. Newton’s gravitational potential
$\Phi$ satisfies

$$\nabla^2\Phi(x, y, z) = 4\pi G\rho(x, y, z) \tag{10}$$

with $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ and $\rho(x, y, z)$ the mass distribution. (If you are not aware of
this, we will show explicitly, for $\rho$ describing a point mass, that this equation gives for $\Phi$
the Newtonian gravitational potential.) What we just did for the membrane tells us that
(10) for Newtonian gravity emerges from minimizing the energy functional

$$E(\Phi) = \int d^3x\left(\frac{1}{8\pi G}(\vec\nabla\Phi)^2 + \rho(\vec x)\Phi(\vec x)\right) \tag{11}$$

Note the plus sign in (11), which reflects how the gravitational potential is defined so that
a mass $m$ located at $\vec x$ in the potential has energy $+m\Phi(\vec x)$.

By the way, $E(\Phi)$ defines a classical field theory, and $\Phi(\vec x)$ is known as a field,* as
it pervades space, just like the familiar electromagnetic field. The hanging string and
membrane allow us not only to introduce the variational principle, but importantly, also 
the notion of a field. 

To verify that (11) leads to (10), we could simply invoke the membrane example or use
the Euler-Lagrange equation (9). Alternatively, it is easy enough to vary (11) directly, going
through the same steps as earlier in this chapter:

$$\delta E(\Phi) = \int d^3x\left\{\frac{1}{4\pi G}\vec\nabla\Phi\cdot\vec\nabla\delta\Phi + \rho\,\delta\Phi\right\} = \int d^3x\left\{-\frac{1}{4\pi G}\nabla^2\Phi + \rho\right\}\delta\Phi$$

where we have integrated by parts. Setting the coefficient of $\delta\Phi$ to zero yields (10).

It suffices to solve (10) for a point mass at the origin,† that is, for $\rho = M\delta^3(\vec x) =
M\delta(x)\delta(y)\delta(z)$. Here we generalize the Dirac delta function defined in chapter I.1 to a
3-dimensional delta function, as indicated and discussed in exercise I.3.2. Recall that you
can think of the delta function $\delta^3(\vec x)$ as essentially a function sharply peaked at $\vec x = 0$. Thus,
$\rho(\vec x)$ is sharply spiked at the origin $\vec x = 0$ and vanishes everywhere else. The total mass
$\int d^3x\,\rho(x, y, z) = M\left(\int dx\,\delta(x)\right)\left(\int dy\,\delta(y)\right)\left(\int dz\,\delta(z)\right)$ is equal to $M$.


* But, at this point, merely a static field without any dynamics, that is, without any dependence on time.
† For an arbitrary mass distribution, we can imagine $\rho(x, y, z)$ as being composed of point masses and add
the contribution from each mass to $\Phi$.
Dimensional analysis* determines the solution of (10) up to an overall constant: since
$\nabla^2$ goes like $1/L^2$ and $\delta^3(\vec x)$ like $1/L^3$, the potential can only go like $1/r$. As you know, it
is in fact $\Phi = -GM/r$. To verify the overall constant, we integrate the two sides of (10)
over a ball of radius $R$ centered at the origin, obtaining for the left hand side $\int d^3x\,\nabla^2\Phi =
\oint d\vec S\cdot\vec\nabla\Phi = 4\pi R^2(GM/R^2)$, and for the right hand side $\int d^3x\,4\pi GM\delta^3(\vec x) = 4\pi GM$. In
other words, we have derived the useful identity

$$\nabla^2\left(-\frac{1}{4\pi r}\right) = \delta^3(\vec x) \tag{12}$$
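The two facts used in this argument can be checked numerically: away from the origin the Laplacian of $-GM/r$ vanishes, and the flux of its gradient through a sphere of any radius is $4\pi GM$. The sketch below does both with finite differences. The units $G = M = 1$, the sample point, and the function names are assumptions made for illustration.

```python
import math

G = 1.0   # hypothetical units with G = 1
M = 1.0   # hypothetical point mass

def potential(x, y, z):
    """Newtonian potential -GM/r of a point mass at the origin."""
    return -G * M / math.sqrt(x * x + y * y + z * z)

def laplacian(f, x, y, z, h=1e-3):
    """Seven-point finite-difference estimate of the Laplacian of f."""
    return (f(x + h, y, z) + f(x - h, y, z) + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * f(x, y, z)) / (h * h)

def flux_through_sphere(f, R, n_theta=100, n_phi=200, h=1e-5):
    """Midpoint-rule estimate of the surface integral of grad f over a sphere of radius R."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            # Radial derivative of f by a central difference along the outward normal.
            radial = (f((R + h) * nx, (R + h) * ny, (R + h) * nz)
                      - f((R - h) * nx, (R - h) * ny, (R - h) * nz)) / (2.0 * h)
            total += radial * R * R * math.sin(theta) * d_theta * d_phi
    return total

if __name__ == "__main__":
    print("Laplacian away from the origin:", laplacian(potential, 1.0, 2.0, 3.0))
    print("flux through sphere of radius 2:", flux_through_sphere(potential, 2.0))
    print("4 pi G M:", 4.0 * math.pi * G * M)
```

That the flux does not depend on the radius of the sphere is just the inverse square law in disguise.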


When you first learned about the inverse square law, did you not wonder where the 
inverse square comes from? This discussion shows that it is essentially a consequence 
of the form of the $(\vec\nabla\Phi)^2$ term in $E(\Phi)$, required by rotational invariance, which in turn
leads to the Laplacian in (10). The inverse square then follows by dimensional analysis.
You realize of course that the electrostatic potential satisfies the same equation (10) if we
interpret the right hand side $G\rho$ as the charge density. This is not an accident, but is due
to the same deep consequence of rotational invariance.

Anticipating, you will see that powerful invariance requirements also fix the form of 
Einstein gravity. 


Appendix 1: The lion by his paw prints 


The brachistochrone problem makes for a great physics story. One winter day in 1697, when Newton (1642-1727)
was 97 - 42 = 55 (not old by modern standards but old in the age he lived in and in any case, long past
the creative brilliance of his youth), he received a letter from Johann Bernoulli (1667-1748) posing the following
problem. Fashion a stiff wire into a curve connecting two points A and B, as shown in figure 3. Thread the wire
through a bead of mass m, as shown. Gravity is acting downward as usual. Release the bead at rest at the higher
point, say A, and let it slide down the wire. What should the shape of the wire be if the bead is to reach B in the
least amount of time? (In Greek, brachistos means “shortest” and chronos means “time.”)

Galileo had erroneously thought that the curve was a circular arc, but he can be excused because, unlike you, 
he did not know variational calculus. 

Newton recognized this as a brazen attempt by one of the best minds on the continent to embarrass him. In
the bitter and contentious controversy between Newton and Gottfried Wilhelm Leibniz (1646-1716) over who
invented calculus, Bernoulli had sided with Leibniz. By this time, Newton was a high-level government official
in charge of the Royal Mint. England was in the middle of issuing new coins in an effort to combat widespread
counterfeiting. Newton’s job was demanding and included catching and executing counterfeiters. The old man
had had a long day at his day job, but he took up the gauntlet, working feverishly into the night, surprising his
dedicated servants. Newton had the solution before dawn. By gosh,² he still had all his marbles together.

Now what did Newton do? He published his solution anonymously in the next issue of the Philosophical 
Transactions of the Royal Society. When Bernoulli saw the elegant solution, he exclaimed, “Tanquam ex ungue 
leonem!” (“One recognizes the lion by his paw prints!”) 

In fact, only three other physicists in Europe (besides Bernoulli) were able to solve what was at the time a 
fiendishly difficult problem: Bernoulli’s older brother Jacob Bernoulli, the Marquis de l’Hospital³ (of the rule you
learned in calculus), and the great Gottfried Leibniz. The variational calculus was not invented until 1766 (almost 
70 years after the lion chose to show his paw prints rather than to roar) by Leonhard Euler (1707-1783) and then 
subsequently refined by Joseph-Louis Lagrange (1736-1813). 

It is a bit of a shame that anonymous scientific publication is no longer done these days. Lots of fun stories 
would result, no doubt. 


* From $\int dx\,\delta(x) = 1$, it follows that $\delta(x)$ has dimension of an inverse length, like $1/L$, and thus $\delta^3(\vec x)$ has
dimension $1/L^3$.


Figure 3 The brachistochrone problem. A bead is released at rest from A and
slides down a stiff wire fashioned into a curve connecting A to B. What should 
the shape of the wire be if the bead is to reach B in the least amount of time? 


Before you read on, see if you can go toe to toe and mano a mano with Isaac and solve the problem before 
dawn! The solution is given below. 

Perhaps you can see intuitively that the correct shape looks like that given in figure 3. The dumb guess would 
be the straight line joining A and B. What you want to do is to start out dropping as vertically as possible to pick 
up speed. Depending on the position of B, it may actually pay to drop below B to attain more speed and then 
“coast” back up. With beads and wires, you could experiment by holding races. 
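If you do not have beads and wires handy, you can hold the race on a computer. The sketch below times a bead sliding without friction along a wire approximated by many short straight pieces, using energy conservation on each piece. It then compares the straight line with a cycloid, which is the well-known answer to the problem, so skip this if you want to attempt exercise 5 unaided. The choice of endpoint $B = (\pi R, 2R)$, the value of $g$, and all function names are assumptions made for illustration; the bead is assumed to keep its speed when passing from one piece to the next.

```python
import math

def descent_time(points, g=9.8):
    """Time for a bead released at rest at points[0] to slide along a polyline.

    Each point is (x, y) with y measured downward. On a straight piece the
    acceleration is constant, so the time is 2 * length / (v_start + v_end).
    """
    y0 = points[0][1]
    total = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        ds = math.hypot(xb - xa, yb - ya)
        va = math.sqrt(2.0 * g * max(0.0, ya - y0))   # speed from energy conservation
        vb = math.sqrt(2.0 * g * max(0.0, yb - y0))
        total += 2.0 * ds / (va + vb)
    return total

def straight_wire(R, n=10):
    """Straight line from A = (0, 0) to B = (pi R, 2 R)."""
    return [(math.pi * R * k / n, 2.0 * R * k / n) for k in range(n + 1)]

def cycloid_wire(R, n=2000):
    """Cycloid x = R (theta - sin theta), y = R (1 - cos theta) from theta = 0 to pi."""
    pts = []
    for k in range(n + 1):
        theta = math.pi * k / n
        pts.append((R * (theta - math.sin(theta)), R * (1.0 - math.cos(theta))))
    return pts

if __name__ == "__main__":
    R = 1.0
    print("straight line:", descent_time(straight_wire(R)), "seconds")
    print("cycloid:      ", descent_time(cycloid_wire(R)), "seconds")
```

The cycloid starts out dropping vertically to pick up speed, just as the intuitive argument above suggests, and it beats the straight line comfortably.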


Appendix 2: Another approach to functional variation 


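I mentioned that there are two approaches to variational calculus. The other approach is to discretize: replace the
continuous variable $x$ by the discrete variables $x_j = ja$, $j = -N, \ldots, N-1, N$ with $Na = L/2$. We denote $\phi(x_j)$
by $\phi_j$. In some sense, we go in the opposite direction from that taken by Newton and Leibniz when they invented
calculus. As already alluded to above, we mentally imagine dividing the string up into tiny segments. In the limit
$a \to 0$ and $N \to \infty$, we should recover our continuous string. Thus, we write $\int dx\,V(\phi(x)) \simeq a\sum_j V(\phi_j)$, so that

$$\frac{\delta}{\delta\phi(y)}\int dx\,V(\phi(x)) \simeq \frac{\partial}{\partial\phi_k}\,a\sum_j V(\phi_j) = aV'(\phi_k)$$

with $k$ determined by $y \simeq x_k$. Discretization reduces functional differentiation to ordinary differentiation. The
derivative term requires a bit more work:

$$\int dx\left(\frac{d\phi}{dx}\right)^2 \simeq a\sum_j\left(\frac{\phi_{j+1} - \phi_j}{a}\right)^2 \tag{13}$$

so that

$$\frac{\delta}{\delta\phi(y)}\int dx\,\frac{1}{2}\left(\frac{d\phi}{dx}\right)^2 \simeq \frac{\partial}{\partial\phi_k}\,\frac{1}{2a}\sum_j(\phi_{j+1} - \phi_j)^2 = \frac{1}{a}\left\{-(\phi_{k+1} - \phi_k) + (\phi_k - \phi_{k-1})\right\} \simeq a\left(-\frac{d^2\phi}{dy^2}\right) \tag{14}$$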
a dy 


I leave the reader to check that this reproduces all the results we derived before. 
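One way to carry out this check is numerically. The sketch below builds the discretized energy for the sample choice $V(\phi) = \cos\phi$ and the trial shape $\phi(x) = \sin x$ on $0 \le x \le \pi$, differentiates with respect to a single $\phi_k$, divides by $a$ (absorbing the regulator as suggested below), and compares with $-\frac{d^2\phi}{dy^2} + V'(\phi(y))$. The trial shape, the potential, and the function names are all assumptions chosen for illustration.

```python
import math

def discrete_energy(phi, a, V):
    """a * sum_j [ (1/2) ((phi[j+1] - phi[j]) / a)^2 + V(phi[j]) ]."""
    gradient_term = sum((phi[j + 1] - phi[j]) ** 2 for j in range(len(phi) - 1)) / (2.0 * a)
    potential_term = a * sum(V(p) for p in phi)
    return gradient_term + potential_term

def functional_derivative(phi, a, V, k, h=1e-5):
    """(1/a) times the ordinary partial derivative of the discretized energy w.r.t. phi[k]."""
    up = list(phi)
    down = list(phi)
    up[k] += h
    down[k] -= h
    return (discrete_energy(up, a, V) - discrete_energy(down, a, V)) / (2.0 * h * a)

if __name__ == "__main__":
    N = 200
    a = math.pi / N
    phi = [math.sin(j * a) for j in range(N + 1)]   # trial shape phi(x) = sin x
    k = 50                                          # the point y = x_k = pi / 4
    y = k * a
    numeric = functional_derivative(phi, a, math.cos, k)
    # Continuum prediction: -phi''(y) + V'(phi(y)) with V = cos, so V' = -sin.
    expected = math.sin(y) - math.sin(math.sin(y))
    print("discretized:", numeric)
    print("continuum:  ", expected)
```

The two numbers agree up to corrections of order $a^2$, which disappear as the segments are made finer.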

I mention one rather trivial technicality, which, however, might bother some fastidious readers. For discrete
variables, clearly $\frac{\partial\phi_j}{\partial\phi_k} = \delta_{jk}$, where the Kronecker delta $\delta_{jk}$ is defined, as in part I, to be 1 if $j = k$ and 0 otherwise.
For continuous variables, we would like to write

$$\frac{\delta\phi(x)}{\delta\phi(y)} = \delta(x - y) \tag{15}$$

with the Dirac delta function introduced in chapter I.1. In particular, this will reproduce (5). Since $\delta(x - y)$ has
dimension of an inverse length, we shouldn't think of $\frac{\delta\phi(x)}{\delta\phi(y)}$ as something like an ill-defined $\frac{\partial\phi(x)}{\partial\phi(y)}$. To have the
correct dimensions, we can either multiply the right hand side of (15) by the “short distance regulator” $a$, or more
simply and preferably, absorb $a$ into the definition of $\frac{\delta}{\delta\phi(y)}$.


Exercises 


1 In the hanging string problem, if we did not make the approximation that the string is stretched only a little
bit, we would have $F(u) = \sqrt{1 + u^2}$ (see remark 2 after (5)). Find the equation determining the shape of the
string.


2 What is the analog of the inverse square law in D spatial dimensions? We will need this result in chapter X.2.


3 Consider the functional $S(a, b) = \int_0^\infty dr\,r(1 - b)a'$ of two functions $a(r)$ and $b(r)$ (with $a' = da/dr$). Find
the $a(r)$ and $b(r)$ that extremize $S$, with the boundary condition $a(\infty) = 1$ and $b(\infty) = 1$.


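4 Denote the downward displacement of a hanging membrane by $\phi(x, y)$. Show that the amount of area
by which the membrane is stretched is given by $\int dx\,dy\,\frac{1}{2}\left(\left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2\right)$. Hint: Use what you learned in
chapters I.5 and I.6.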


5 Solve the brachistochrone problem. By the way, the solution contains a “moral to the story.” 


Notes 


1. There is a movie about a fish called Wanda, but not one about a fish called Poisson. 

2. A mild anachronism: the euphemism was introduced circa 1750. Incidentally, I originally wrote “by Jehovah,”
but one reader thought that very few readers would know who Jehovah was. Too bad. 

3. Incidentally, although the French often snickered at the ignorance of American physicists who talked about 
the hospital rule, the marquis himself spelled his name with an s, as in “the hospital.” 
Variational calculus with several unknown functions


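In the previous chapter we considered a functional of a single function $\phi$, which could
depend on more than one variable $x^1, x^2, \ldots, x^D$. We could also easily consider a functional
$E(\phi_1, \phi_2, \ldots, \phi_n) = \int dx\,\mathcal{E}\left(\frac{d\phi_1}{dx}, \ldots, \frac{d\phi_n}{dx}, \phi_1, \ldots, \phi_n\right)$ that depends on several functions
$\phi_j(x)$, $j = 1, \ldots, n$, each of which is a function of a single variable $x$. Simply vary each
function $\phi_j(x)$ to obtain $n$ Euler-Lagrange equations (you should verify this):

$$\frac{d}{dx}\left(\frac{\delta\mathcal{E}}{\delta\frac{d\phi_j}{dx}}\right) - \frac{\delta\mathcal{E}}{\delta\phi_j} = 0 \tag{1}$$

to be solved for the $n$ unknown functions $\phi_1, \ldots, \phi_n$. This is no different from the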
conceptual jump from the calculus of a single variable to that of many variables. Note that 
the formalism is completely general; in particular, E is just a functional, not necessarily 
having anything to do with energy. More generally, we could of course also consider a 
functional of several functions, each of which depends on several variables. 

Armed with this understanding, you can now solve various classic problems. One of the 
most celebrated is that of finding the path of shortest distance between two given points, 
the so-called geodesic problem. 


Reparametrization invariance 


To start out easy, let us look at paths in flat 2-dimensional Cartesian space. A curve is
described by a set of points labeled by two real functions $x(\lambda)$, $y(\lambda)$, which vary as the
parameter $\lambda$ varies. (See figure 1.) The choice of $\lambda$ is arbitrary: the only requirement is that it
increases monotonically from some initial value $\lambda_i$ to some final value $\lambda_f$ as we move along
the curve from the starting point $A = (x(\lambda_i), y(\lambda_i))$ to the endpoint $B = (x(\lambda_f), y(\lambda_f))$.
Imagine the curve as a highway on a map connecting city A with city B. We could
parametrize where we are on the highway by specifying the amount of gas consumed
for example, or the number of songs sung, but it is clearly more sensible to use for $\lambda$ the
actual road distance $l$ we have covered. As we will see, while the choice of parametrization
is up to us, mathematics and common sense favor the particular parametrization $\lambda = l$.

Figure 1 To describe a curve between two points, we have considerable freedom in choosing the
parameter $\lambda$.

The length of the curve is given by

$$\int\sqrt{dx^2 + dy^2} = \int_{\lambda_i}^{\lambda_f}d\lambda\,\sqrt{\left(\frac{dx}{d\lambda}\right)^2 + \left(\frac{dy}{d\lambda}\right)^2} \tag{2}$$


The length, being geometric, is manifestly reparametrization invariant, that is, independent
of our choice of $\lambda$ as long as it is reasonable. This is one of those “more obvious than
obvious” facts, since $\int \sqrt{dx^2 + dy^2}$ on the left hand side of (2) is manifestly independent
of $\lambda$.

Your calculus teacher probably told you not to write something like the left hand side
of (2): a properly formulated integral should look like the right hand side of (2). But as I
already said in the preceding chapter, it is perfectly kosher: just think of $\int \sqrt{dx^2 + dy^2}$ as
the sum of infinitesimals $\sum_i \sqrt{(\Delta x)_i^2 + (\Delta y)_i^2}$.

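If we insist, we can check the reparametrization invariance of (2). First, obviously the
powers of $d\lambda$ match, so that if we scale $\lambda \to a\lambda$, the length is unchanged, as it should be.
More generally, if we use another parameter $\eta$ related to $\lambda$ by $\lambda = \lambda(\eta)$, then
$\frac{dx}{d\lambda} = \frac{d\eta}{d\lambda}\frac{dx}{d\eta}$ and $d\lambda = \frac{d\lambda}{d\eta}\, d\eta$. Plugging in, we have

$$\int_{\lambda_i}^{\lambda_f} d\lambda \sqrt{\left(\frac{dx}{d\lambda}\right)^2 + \left(\frac{dy}{d\lambda}\right)^2} = \int_{\eta_i}^{\eta_f} d\eta \sqrt{\left(\frac{dx}{d\eta}\right)^2 + \left(\frac{dy}{d\eta}\right)^2}$$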
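
If you would rather see the invariance than prove it, here is a minimal numerical sketch. The test curve (a quarter of the unit circle), the second parameter `eta`, and the map `lam = (eta + eta**3) / 2` are all made up for illustration; any monotonic map would do. The length is approximated exactly as described above, as a sum of little chords $\sqrt{(\Delta x)^2 + (\Delta y)^2}$.

```python
import math

def curve(lam):
    """Hypothetical test curve: a quarter of the unit circle, lam in [0, 1]."""
    angle = 0.5 * math.pi * lam
    return math.cos(angle), math.sin(angle)

def reparametrized(eta):
    """The same curve, with lam = (eta + eta**3) / 2, a monotonic map of [0, 1] onto [0, 1]."""
    return curve(0.5 * (eta + eta ** 3))

def length(point_at, p_start, p_end, steps=20000):
    """Approximate the length by summing sqrt(dx^2 + dy^2) over small chords."""
    total = 0.0
    x_prev, y_prev = point_at(p_start)
    for k in range(1, steps + 1):
        p = p_start + (p_end - p_start) * k / steps
        x, y = point_at(p)
        total += math.hypot(x - x_prev, y - y_prev)
        x_prev, y_prev = x, y
    return total

length_lam = length(curve, 0.0, 1.0)            # uniform steps in lam
length_eta = length(reparametrized, 0.0, 1.0)   # uniform steps in eta, uneven in lam

print(length_lam, length_eta, math.pi / 2)
```

Both sums should agree with each other, and with the exact quarter circumference $\pi/2$, up to the small discretization error of the chord sum.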


Finding the straight line 


Of course, this particular example is trivial to solve, but that’s the point: using a trivial
example to illustrate general concepts. You have probably already realized that it would
be sensible to put the starting point $A = (0, 0)$ at the origin and to rotate axes so that the
endpoint $B$ sits on the $x$-axis, in which case we might use the parametrization $\lambda = x$, so
that $E = \int dx \sqrt{1 + \left(\frac{dy}{dx}\right)^2}$. Obviously, the length is minimized by setting $\frac{dy}{dx} = 0$. Imposing
the boundary conditions, we obtain the solution $y(x) = 0$.

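We, however, do not want to solve the problem in the simplest possible way, but more
generally, so as to learn about the calculus of variation. Well, all we have to do is to use (1),
with the notational shift $\varphi_1 \to x$, $\varphi_2 \to y$, $x \to \lambda$. In this almost laughably simple example
$E(x, y) = \int d\lambda\, \mathcal{E}$, with $\mathcal{E}\left(\frac{dx}{d\lambda}, \frac{dy}{d\lambda}\right) = \sqrt{\left(\frac{dx}{d\lambda}\right)^2 + \left(\frac{dy}{d\lambda}\right)^2}$, so that (1) gives two equations for two
unknowns:

$$\frac{d}{d\lambda}\left(\frac{1}{\sqrt{\left(\frac{dx}{d\lambda}\right)^2 + \left(\frac{dy}{d\lambda}\right)^2}}\,\frac{dx}{d\lambda}\right) = 0 \quad \text{and} \quad \frac{d}{d\lambda}\left(\frac{1}{\sqrt{\left(\frac{dx}{d\lambda}\right)^2 + \left(\frac{dy}{d\lambda}\right)^2}}\,\frac{dy}{d\lambda}\right) = 0 \tag{3}$$

These are easy enough to solve by inspection: $\frac{dx}{d\lambda}$ and $\frac{dy}{d\lambda}$ are both constant, independent
of $\lambda$. Hence we have shown, as expected, that the straight line gives the shortest distance
between two points.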

The more important lesson here, however, is to observe how (3) simplifies with the
“preferred” or natural parametrization, namely to use for $\lambda$ the length $l$ along the curve,
defined by $dl = \sqrt{dx^2 + dy^2}$, so that

$$\left(\frac{dx}{dl}\right)^2 + \left(\frac{dy}{dl}\right)^2 = 1$$

Replace $\lambda$ by $l$ in (3). Watch the square roots disappear, so that (3) simplifies to
$\frac{d^2x}{dl^2} = 0$ and $\frac{d^2y}{dl^2} = 0$. We will exploit this simplification ruthlessly when we get to
Einstein’s theory of gravity.

The astute reader might realize that there is a third equation, the definition of $l$:
$\left(\frac{dx}{dl}\right)^2 + \left(\frac{dy}{dl}\right)^2 = 1$. Differentiating this equation with respect to $l$, we obtain
$\frac{dx}{dl}\frac{d^2x}{dl^2} + \frac{dy}{dl}\frac{d^2y}{dl^2} = 0$, and thus this equation is not independent of the other two
equations already given.
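
A small numerical aside on (3), offered as a sketch rather than as part of the argument. For a generic parameter $\lambda$, what (3) actually holds fixed is the derivative divided by the square root, that is, the direction of the tangent; only with a parameter proportional to $l$ do the bare derivatives become constant. The line $y = 2x$ and the deliberately uneven parametrization $x = \lambda + \lambda^3$ below are made-up illustrations, and the derivatives are estimated by finite differences.

```python
import math

def x_of(lam):
    """Hypothetical uneven parametrization of the straight line y = 2x."""
    return lam + lam ** 3

def y_of(lam):
    return 2.0 * x_of(lam)

def derivative(f, lam, h=1e-6):
    """Central finite-difference estimate of df/dlam."""
    return (f(lam + h) - f(lam - h)) / (2.0 * h)

def unit_tangent(lam):
    """The two quantities sitting inside d/dlam in (3)."""
    dx = derivative(x_of, lam)
    dy = derivative(y_of, lam)
    e = math.hypot(dx, dy)   # the square root in (3)
    return dx / e, dy / e

samples = [0.0, 0.5, 1.0, 2.0]
speeds = [derivative(x_of, lam) for lam in samples]   # dx/dlam changes along the line
tangents = [unit_tangent(lam) for lam in samples]     # the ratios in (3) do not

for lam, speed, tangent in zip(samples, speeds, tangents):
    print(lam, speed, tangent)
```

The printed `speeds` grow with `lam`, while every entry of `tangents` is the same unit vector along the line, which is what (3) demands.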


The world’s most complicated description of a straight line 


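I prefer to go slow here, and so, instead of jumping into curved spaces immediately, let’s
stay on the flat plane but now use polar coordinates. The length of a curve connecting the
two points is then given by

$$\int \sqrt{dr^2 + r^2 d\theta^2} = \int_{\lambda_i}^{\lambda_f} d\lambda \sqrt{\left(\frac{dr}{d\lambda}\right)^2 + r^2\left(\frac{d\theta}{d\lambda}\right)^2} = \int_{\lambda_i}^{\lambda_f} d\lambda\, L \tag{4}$$

with $L$ now playing the role of $\mathcal{E}$ in (1).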
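A thorough understanding of this example will be important in mastering Einstein
gravity later, so to make sure that you follow, let me spell out the steps here. First, we
vary $L$ with respect to $\frac{dr}{d\lambda}$ and obtain $\frac{1}{L}\frac{dr}{d\lambda}$, with $L$ the square root in (4). Then we
vary with respect to $r$ and obtain $\frac{1}{L}\, r\left(\frac{d\theta}{d\lambda}\right)^2$. Thus, we obtain
$\frac{d}{d\lambda}\left(\frac{1}{L}\frac{dr}{d\lambda}\right) - \frac{1}{L}\, r\left(\frac{d\theta}{d\lambda}\right)^2 = 0$ as in
(1). Once again it is clear that we could make life easier by using the length parametrization.
Thus, setting $\lambda$ to $l$, we have $L = 1$, so that this equation simplifies to

$$\frac{d^2r}{dl^2} - r\left(\frac{d\theta}{dl}\right)^2 = 0 \tag{5}$$

Similarly, repeating for $\theta$ what we did with $r$, we find

$$\frac{d}{dl}\left(r^2\frac{d\theta}{dl}\right) = 0 \implies \frac{d^2\theta}{dl^2} + \frac{2}{r}\frac{dr}{dl}\frac{d\theta}{dl} = 0 \tag{6}$$

In addition, we also have of course

$$\left(\frac{dr}{dl}\right)^2 + r^2\left(\frac{d\theta}{dl}\right)^2 = 1 \tag{7}$$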


Confusio: “But if we have (7), isn’t the integrand we are varying in (4) just 1? How do
you vary 1?”

Dear reader, if you are confused also, just use some other parameter $\lambda$, obtain the
variational equations with the square root in it (as in (3), for example), and then replace $\lambda$
by $l$, as we did above.

Confusio: “But we have three equations (5), (6), and (7) to determine two unknown 
functions?” 

You could choose any two of the three. The third then provides a consistency check, if 
you want. 

Confusio: “Which two should I choose?” 

Well, if I were you, Confusio, I would choose the two that make my life the easiest. 

Since (7) is a first order differential equation, it is our clear-cut favorite. Of the two second
order differential equations (5) and (6), the latter is clearly simpler to solve and hence is
the more sensible* choice. Indeed, we now see there is no point in even differentiating, as
we rather stupidly did in (6). The original form $\frac{d}{dl}\left(r^2\frac{d\theta}{dl}\right) = 0$ yields immediately
$r^2\frac{d\theta}{dl} = a$, an unknown constant. Inserting this into (7) gives

$$\left(\frac{dr}{dl}\right)^2 + \frac{a^2}{r^2} = 1 \tag{8}$$

Indeed, you might have realized that, in terms of a mechanical analog (interpreting $l$
as time), we have just used angular momentum conservation to eliminate $\frac{d\theta}{dl}$ in (7), so
that (8) simply describes a particle with mass = 2 and energy = 1 moving in a repulsive
“centrifugal” potential $V(r) = \frac{a^2}{r^2}$.

Integrating (8), we find $r^2 = l^2 + a^2$, where we absorbed an integration constant into $l$
by setting $l = 0$ when $r = a$. Integrating $\frac{d\theta}{dl} = \frac{a}{l^2 + a^2}$, we obtain $a \tan(\theta - \theta_0) = l$. A nice
exercise in elementary geometry shows that this indeed describes a straight line.
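
If the elementary geometry does not come to you right away, a quick numerical check may reassure you. The sketch below takes the solution just quoted, with made-up values for the two integration constants (called `a` and `theta0` here), converts a handful of points back to Cartesian coordinates, and projects them onto the fixed direction `theta0`. For a straight line whose closest approach to the origin is `a`, that projection should equal `a` at every point.

```python
import math

def polar_solution(l, a, theta0):
    """Solution quoted in the text: r^2 = l^2 + a^2 and a*tan(theta - theta0) = l."""
    r = math.sqrt(l * l + a * a)
    theta = theta0 + math.atan2(l, a)   # equals theta0 + arctan(l/a) for a > 0
    return r, theta

a, theta0 = 1.5, 0.4   # hypothetical values of the two integration constants

points = []
for l in [-3.0, -1.0, 0.0, 0.7, 2.0, 5.0]:
    r, theta = polar_solution(l, a, theta0)
    points.append((r * math.cos(theta), r * math.sin(theta)))

# Component of each point along the unit vector (cos theta0, sin theta0).
projections = [x * math.cos(theta0) + y * math.sin(theta0) for x, y in points]
print(projections)
```

All the projections coincide, so the points satisfy $x\cos\theta_0 + y\sin\theta_0 = a$, the equation of a straight line in Cartesian coordinates.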

The point of this is not to show that the author is capable of solving coupled differential
equations and to obtain a rather complicated description of a straight line, but to
understand that a straight line can apparently take on quite different forms in different
coordinate systems. This will be a recurring theme in Einstein gravity.

* As an undergraduate, I once took a math course for which the final exam consisted of ten problems, of which
we were required to do only one. The ability to recognize which problem is doable is an extremely valuable skill
in theoretical physics, and presumably in math also. The professor later told us that one of the ten problems had
not yet been solved, and he was hoping that some bright undergrad would solve it “by chance.”

Just to whet the reader’s appetite, I might mention that when we study the motion of 
particles around a black hole, we will encounter the same type of equations as (5) and (7). 
Here we made our lives sweeter by favoring (7) over (5). When we get to Einstein gravity, 
we will use this “trick” again and again, namely tackling the analog of (6) first. 


Great circles 


We continue with a slightly more difficult problem: find the geodesics on a sphere. With
the usual spherical coordinates, the length of a curve on a sphere is given by

$$\int \sqrt{d\theta^2 + \sin^2\theta\, d\varphi^2} = \int_{\lambda_i}^{\lambda_f} d\lambda \sqrt{\left(\frac{d\theta}{d\lambda}\right)^2 + \sin^2\theta \left(\frac{d\varphi}{d\lambda}\right)^2} = \int_{\lambda_i}^{\lambda_f} d\lambda\, L \tag{9}$$

Here we set $\lambda$ to $l$ without further ado and obtain

$$\frac{d^2\theta}{dl^2} - \sin\theta\cos\theta\left(\frac{d\varphi}{dl}\right)^2 = 0 \tag{10}$$

and

$$\frac{d}{dl}\left(\sin^2\theta\,\frac{d\varphi}{dl}\right) = 0 \implies \frac{d^2\varphi}{dl^2} + \frac{2\cos\theta}{\sin\theta}\frac{d\theta}{dl}\frac{d\varphi}{dl} = 0 \tag{11}$$

Since $L$ does not depend on $\varphi$, the equation for $\varphi$ is particularly easy* to obtain.

We all know that on a sphere great circles are geodesics. Indeed, we see that $\varphi$ constant
with $\theta$ a linear function of $l$ is a solution. Lines of fixed longitude are great circles. In
contrast, $\theta$ constant is not a solution unless that constant is $\pi/2$. Flying along a fixed
latitude is not a fuel-economizing move† unless you are at the equator.

As in the simpler flat space problem above, we should remember that we have a third
equation

$$\left(\frac{d\theta}{dl}\right)^2 + \sin^2\theta\left(\frac{d\varphi}{dl}\right)^2 = 1 \tag{12}$$

and just as in that case the reader could differentiate this equation and verify that it is not
independent of (10) and (11). (The reader could also see that if $l$ is interpreted as time, (12)
expresses the conservation of energy in the motion described by (10) and (11).)

With what you have learned here, you could now immediately determine the geodesics 
on the Poincaré half plane discussed in chapter I.5. Try doing it, and then look at appendix 4 
if you need help. 
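
Before moving on, you might enjoy letting a computer confirm that (10) and (11) really trace out great circles. The sketch below is mine, not part of the text: it integrates the two equations with a standard fourth-order Runge-Kutta step (the helper names `rhs`, `rk4_step`, `embed`, and `cross`, the starting point, and the initial heading are all arbitrary choices) and then checks two things along the way: that the curve stays in a plane through the center of the unit sphere, and that the left-hand side of (12) stays equal to 1.

```python
import math

def rhs(state):
    """Equations (10) and (11) written as a first-order system in l."""
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h in the length l."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * h))
    k3 = rhs(shifted(state, k2, 0.5 * h))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def embed(theta, phi):
    """Point on the unit sphere in 3-dimensional Cartesian coordinates."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Hypothetical initial data: start at theta = 1.0, phi = 0 with a generic
# heading, normalized so that (12) holds at l = 0.
theta0, phi0, heading = 1.0, 0.0, 0.6
state = (theta0, phi0, math.cos(heading), math.sin(heading) / math.sin(theta0))

# Normal to the plane through the origin containing the starting point and
# the initial tangent vector; a great circle never leaves this plane.
t, p, dt, dp = state
tangent = (math.cos(t) * math.cos(p) * dt - math.sin(t) * math.sin(p) * dp,
           math.cos(t) * math.sin(p) * dt + math.sin(t) * math.cos(p) * dp,
           -math.sin(t) * dt)
normal = cross(embed(t, p), tangent)

max_off_plane = 0.0
max_speed_error = 0.0
h = 0.001
for _ in range(3000):                       # follow the curve for a length l = 3
    state = rk4_step(state, h)
    t, p, dt, dp = state
    point = embed(t, p)
    off_plane = abs(sum(n * x for n, x in zip(normal, point)))
    speed_sq = dt ** 2 + math.sin(t) ** 2 * dp ** 2   # left-hand side of (12)
    max_off_plane = max(max_off_plane, off_plane)
    max_speed_error = max(max_speed_error, abs(speed_sq - 1.0))

print(max_off_plane, max_speed_error)
```

Both printed numbers should be tiny, limited only by the integration error. Note that the coordinates themselves misbehave at the poles, where the $\cos\theta/\sin\theta$ in (11) blows up, so the initial data here were chosen to steer clear of them.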


* The analog of this will also hold in Einstein gravity, associated with the notion of Killing vectors (see
chapter V.4).
† As you will see, this foreshadows the discussion in chapter V.1.

The geodesic equation


After these three examples, we are now ready to formulate the general case. The strategy 
is now familiar: we do what the ants mentioned in the prologue do. Without the benefit 
of advanced math, they wander off from some “tried and true” path, hoping to find a 
better one. 

Consider a space with the metric $g_{\mu\nu}(x)$ and a curve $X^\mu(\lambda)$, where $\lambda$ is, once again,
any parameter that varies monotonically along the curve. (As in chapter I.5, we use Greek
indices to label the coordinates, and repeated indices are summed over.) Once again, we
deftly avoid the notational confusion inflicted on the reader by some books. Beware of the
distinction between $x$ and $X$! We are to minimize the length of the curve fixed at some
initial and final positions:

$$\int dl = \int \sqrt{g_{\mu\nu}\, dX^\mu dX^\nu} = \int d\lambda \sqrt{g_{\mu\nu}(X)\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda}} = \int d\lambda\, L \tag{13}$$

where $g_{\mu\nu}$ is evaluated at $X$ of course. (At any point, if you are confused, you could always
refer to the example of the sphere in (9), for which $g_{\theta\theta} = 1$, $g_{\theta\varphi} = 0$, and $g_{\varphi\varphi} = \sin^2\theta$.)
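Here we could have simply plugged this functional of $X^\mu(\lambda)$ into the general equation
(1), but in the interest of pedagogy, it is worthwhile (and easy enough) to go through the
Euler-Lagrange steps again. Setting the variation of $\int d\lambda\, L$ to zero, we obtain

$$\begin{aligned}
0 = \delta \int d\lambda\, L &= \int d\lambda\, \delta L \\
&= \int d\lambda \left( \sqrt{g_{\mu\nu}(X + \delta X)\frac{d(X^\mu + \delta X^\mu)}{d\lambda}\frac{d(X^\nu + \delta X^\nu)}{d\lambda}} - \sqrt{g_{\mu\nu}(X)\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda}} \right) \\
&= \int d\lambda\, \frac{1}{2L}\left( 2 g_{\mu\sigma}\frac{dX^\mu}{d\lambda}\frac{d\,\delta X^\sigma}{d\lambda} + \partial_\sigma g_{\mu\nu}\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda}\,\delta X^\sigma \right)
\end{aligned} \tag{14}$$

(If you are confused, read the paragraph following (4) again.)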

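We emphasize that $g_{\mu\nu}$ and $\partial_\sigma g_{\mu\nu}$ are evaluated at $X(\lambda)$. Indeed, the second term arises
from the dependence of $g_{\mu\nu}$ on $X$. Integrating the first term by parts (with $\delta X^\sigma = 0$ at the
endpoints as usual), we obtain

$$\frac{d}{d\lambda}\left(\frac{1}{L}\, g_{\mu\sigma}\frac{dX^\mu}{d\lambda}\right) - \frac{1}{2L}\,\partial_\sigma g_{\mu\nu}\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda} = 0 \tag{15}$$

We have learned only too well that to simplify (15) we should exploit the freedom in
choosing $\lambda$ and use length parametrization. Set $d\lambda$ to $dl$ so that $L = 1$. Then we have

$$\frac{d}{dl}\left(g_{\mu\sigma}\frac{dX^\mu}{dl}\right) - \frac{1}{2}\,\partial_\sigma g_{\mu\nu}\frac{dX^\mu}{dl}\frac{dX^\nu}{dl} = 0 \tag{16}$$

where, to repeat, $g_{\mu\nu}$ and $\partial_\sigma g_{\mu\nu}$ are to be evaluated at $X(\lambda)$. (For example, for the sphere,
for $\sigma = \theta$ we have, noting that the metric is diagonal,
$\frac{d}{dl}\left(g_{\theta\theta}\frac{d\theta}{dl}\right) - \frac{1}{2}\,\partial_\theta g_{\varphi\varphi}\left(\frac{d\varphi}{dl}\right)^2 = 0$, which
is just (10).)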

In specific cases, rather than pushing through the differentiation in the first term in (16),
we are better off restraining ourselves. The geodesic equation in this form corresponds to
the first halves of (6) and (11). In particular, if $g_{\mu\nu}$ does not depend on $x^\sigma$, the second term
in (16) vanishes, and we learn immediately that $g_{\mu\sigma}\frac{dX^\mu}{dl}$ is constant along the geodesic
curve.

For further analysis, however, we should, and could, push ahead and carry out the
differentiation in the first term of (16), just as in the second halves of (6) and (11). We obtain
$g_{\mu\sigma}\frac{d^2X^\mu}{dl^2} + \partial_\nu g_{\mu\sigma}\frac{dX^\nu}{dl}\frac{dX^\mu}{dl} - \frac{1}{2}\,\partial_\sigma g_{\mu\nu}\frac{dX^\mu}{dl}\frac{dX^\nu}{dl} = 0$,
which on multiplication by $g^{\rho\sigma}$ becomes*

$$\frac{d^2X^\rho}{dl^2} + \frac{1}{2}\, g^{\rho\sigma}\left(2\,\partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right)\frac{dX^\mu}{dl}\frac{dX^\nu}{dl} = 0 \tag{17}$$

Defining the Christoffel symbol

$$\Gamma^\rho_{\mu\nu}(x) = \frac{1}{2}\, g^{\rho\sigma}(x)\left(\partial_\mu g_{\nu\sigma}(x) + \partial_\nu g_{\mu\sigma}(x) - \partial_\sigma g_{\mu\nu}(x)\right) \tag{18}$$

we can write (17) as

$$\frac{d^2X^\rho}{dl^2} + \Gamma^\rho_{\mu\nu}(X(l))\,\frac{dX^\mu}{dl}\frac{dX^\nu}{dl} = 0 \tag{19}$$

Note that $\Gamma^\rho_{\mu\nu}$ is defined to be symmetric in $\mu\nu$, since in (19) it multiplies the symmetric
combination $\frac{dX^\mu}{dl}\frac{dX^\nu}{dl}$.


It is often useful to deal with the auxiliary quantity 


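$$\Gamma_{\mu\nu\cdot\sigma} = \frac{1}{2}\left(\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right) \tag{20}$$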


We put a dot in the group of three indices to remind us that this object is symmetric 
under the exchange of the first two indices, and that the last index o is the one who came 
downstairs, and when needed, could be sent upstairs again. 

Comparing (19) with (5) and (6), you could read off the nonvanishing Christoffel symbols
for polar coordinates:

$$\Gamma^r_{\theta\theta} = -r \quad \text{and} \quad \Gamma^\theta_{r\theta} = \frac{1}{r} \tag{21}$$

(Note the factor of 2 in (6) disappears because of the symmetry of the Christoffel symbol
in its two lower indices.) Similarly, you could read off from (10) and (11) the nonvanishing
Christoffel symbols for the sphere:

$$\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta \quad \text{and} \quad \Gamma^\varphi_{\theta\varphi} = \frac{\cos\theta}{\sin\theta} \tag{22}$$

You should verify that you obtain the same results plugging the metric into (18).

Decide for yourself which method of calculating $\Gamma^\rho_{\mu\nu}$ is computationally simpler for
you. I prefer varying (13) directly to using (18). With practice the variation could be done
mentally. For really simple examples, such as the ones here, it is pretty much a toss up.
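
There is also a third, thoroughly unglamorous method: let a computer plug the metric into (18). The sketch below does this numerically for the two 2-dimensional examples of this chapter, estimating the derivatives of the metric by finite differences. All function names and the sample points are my own choices, and the code handles only 2-by-2 metrics; it is meant as a cross-check on (21) and (22), not as a general-purpose tool.

```python
import math

def polar_metric(x):
    """Metric of the flat plane in polar coordinates x = (r, theta)."""
    r, _ = x
    return [[1.0, 0.0], [0.0, r * r]]

def sphere_metric(x):
    """Metric of the unit sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, direction, h=1e-5):
    """Central difference estimate of d g_{mu nu} / d x^direction."""
    up = list(x)
    down = list(x)
    up[direction] += h
    down[direction] -= h
    g_up, g_down = metric(up), metric(down)
    return [[(g_up[m][n] - g_down[m][n]) / (2.0 * h) for n in range(2)]
            for m in range(2)]

def christoffel(metric, x):
    """gamma[rho][mu][nu] computed from (18) for a 2-dimensional metric."""
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, s) for s in range(2)]   # dg[s][m][n] = d_s g_{mn}
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for rho in range(2):
        for mu in range(2):
            for nu in range(2):
                gamma[rho][mu][nu] = 0.5 * sum(
                    g_inv[rho][s] * (dg[mu][nu][s] + dg[nu][mu][s] - dg[s][mu][nu])
                    for s in range(2))
    return gamma

gamma_polar = christoffel(polar_metric, (2.0, 0.3))     # r = 2
gamma_sphere = christoffel(sphere_metric, (1.1, 0.3))   # theta = 1.1

print(gamma_polar[0][1][1], gamma_polar[1][0][1])       # compare with -r and 1/r
print(gamma_sphere[0][1][1], gamma_sphere[1][0][1])     # compare with (22)
```

With index 0 standing for the first coordinate and index 1 for the second, `gamma_polar[0][1][1]` is $\Gamma^r_{\theta\theta}$ and `gamma_polar[1][0][1]` is $\Gamma^\theta_{r\theta}$, and similarly for the sphere.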

Note that the geodesic equation (19), when written in terms of the tangent vector
$V^\mu = \frac{dX^\mu}{dl}$ to the curve, actually looks quite simple:

$$\frac{dV^\rho}{dl} + \Gamma^\rho_{\mu\nu} V^\mu V^\nu = 0 \tag{23}$$

* Recall from chapter I.5 that $g^{\rho\sigma} g_{\sigma\mu} = \delta^\rho_\mu$.

Keep in mind also that often it is easier to solve (16), which can now be written as

$$\frac{d}{dl}\left(g_{\mu\sigma} V^\mu\right) - \frac{1}{2}\left(\partial_\sigma g_{\mu\nu}\right) V^\mu V^\nu = 0 \tag{24}$$


I have intentionally derived the geodesic equation in a slightly cumbersome way, dragging
$L$ along, to emphasize that the length (13) is a geometric quantity independent of
the parametrization $\lambda$ used. Only at the end do I exploit our parametrization freedom and
set $\lambda$ to $l$ to obtain (16). I will do this throughout. In contrast, the authors of some texts
vary an alternative quantity $\int dl\, g_{\mu\nu}\frac{dX^\mu}{dl}\frac{dX^\nu}{dl}$ (which is emphatically not parametrization
independent) and use the Euler-Lagrange equation (1) to arrive at (16) in one step. The
derivation looks cleaner but comes with the cost of potential conceptual misunderstanding.
Once you understand all this, you can, however, safely use this method. At the least,
it serves as a useful mnemonic.
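
To see concretely what “not parametrization independent” means, try the simplest possible reparametrization. The constant $a > 0$ and the parameter $\eta$ below are introduced only for this check. Write the alternative quantity with a generic parameter $\lambda$, and put $\lambda = a\eta$, so that $d\lambda = a\, d\eta$ and $\frac{dX^\mu}{d\lambda} = \frac{1}{a}\frac{dX^\mu}{d\eta}$. Then

$$\int d\lambda\, g_{\mu\nu}\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda} = \frac{1}{a}\int d\eta\, g_{\mu\nu}\frac{dX^\mu}{d\eta}\frac{dX^\nu}{d\eta}, \qquad \int d\lambda \sqrt{g_{\mu\nu}\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda}} = \int d\eta \sqrt{g_{\mu\nu}\frac{dX^\mu}{d\eta}\frac{dX^\nu}{d\eta}}$$

The integral without the square root picks up a factor of $1/a$ and so depends on how we label the curve, while the length in (13) does not care.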


Connection to classical differential geometry 


At this point Confusio suddenly speaks up. “Hey, didn’t we encounter the Christoffel
symbol already back when we discussed classical differential geometry in chapter I.7?”
Excellent! Confusio has been paying attention and is not as confused as he looks! But
what is the connection? In chapter I.7 the Christoffel symbol appeared in the variation of
the basis vectors $\partial_\mu \vec{e}_\nu$. Here it has to do with the first derivative of the metric.
In fact, you already encountered the key in exercise I.7.6, namely that

$$\partial_\mu g_{\nu\sigma} = \Gamma_{\mu\nu\cdot\sigma} + \Gamma_{\mu\sigma\cdot\nu} \tag{25}$$

Applying a lemma given in chapter I.4, we can invert (20) to obtain precisely this.


A straight line does not have to look straight 


Professor Flat ambles by again, saying, “It would be enlightening to give an alternative 
derivation of the geodesic equation (19).” 

“Let’s guess,” you and I reply in unison, “We go to locally flat coordinates, right?” By 
now, we are catching on. 

PF: “Excellent! We all know that in a locally flat region the geodesics are just straight 
lines, from stretched linen, you know.” 

It sounds vaguely plausible; Professor Flat is also an amateur etymologist. 

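So, consider locally flat coordinates $y^\rho(x)$, related to our curved coordinates as indicated.
The metric is $ds^2 = \delta_{\rho\sigma}\, dy^\rho dy^\sigma = g_{\mu\nu}(x)\, dx^\mu dx^\nu$ so that

$$g_{\mu\nu}(x) = \delta_{\rho\sigma}\frac{\partial y^\rho}{\partial x^\mu}\frac{\partial y^\sigma}{\partial x^\nu} \tag{26}$$

A straight line is a curve $y^\rho(s)$ for which the tangent vector $\frac{dy^\rho}{ds}$ does not change, that is,
$\frac{d^2y^\rho}{ds^2} = 0$.

Professor Flat interjects: “That’s what ‘straightforward’ means: you keep moving forward.
This derivation will be straightforward also, just plug and chug. We replace $y$ by $x$
in the simple equation $\frac{d^2y^\rho}{ds^2} = 0$ to obtain a complicated looking equation!”

So, $\frac{dy^\rho}{ds} = \frac{\partial y^\rho}{\partial x^\lambda}\frac{dx^\lambda}{ds}$, and then

$$0 = \frac{d^2y^\rho}{ds^2} = \frac{\partial y^\rho}{\partial x^\lambda}\frac{d^2x^\lambda}{ds^2} + \frac{\partial^2 y^\rho}{\partial x^\mu \partial x^\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} \tag{27}$$

Multiplying by $\frac{\partial x^\sigma}{\partial y^\rho}$ (and using $\frac{\partial x^\sigma}{\partial y^\rho}\frac{\partial y^\rho}{\partial x^\lambda} = \delta^\sigma_\lambda$), we obtain (after renaming the index $\sigma$)

$$\frac{d^2x^\lambda}{ds^2} + \frac{\partial x^\lambda}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial x^\mu \partial x^\nu}\frac{dx^\mu}{ds}\frac{dx^\nu}{ds} = 0 \tag{28}$$

As promised, we manage to replace the simple equation $\frac{d^2y^\rho}{ds^2} = 0$ by the complicated-looking* (28).

Upon identifying

$$\Gamma^\lambda_{\mu\nu} = \frac{\partial x^\lambda}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial x^\mu \partial x^\nu} \tag{29}$$

we recognize (28) as just the geodesic equation (19).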

PF: “See how easy it is? We just have to show off our mastery of the chain rule!” 

A straight line does not have to look straight in curved coordinates. Rather, it is described
by a curve determined by (28). Nothing wrong with the straight line, but you have chosen 
funny coordinates. A bit like the fun house mirrors in amusement parks. 

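In this formalism, how do we relate the Christoffel symbol in (29) to the first derivative
of the metric, as given in (18) and (20)? The way to do this is as follows. Differentiating¹
the metric in (26) gives

$$\partial_\lambda g_{\mu\nu}(x) = \delta_{\rho\sigma}\left(\frac{\partial^2 y^\rho}{\partial x^\lambda \partial x^\mu}\frac{\partial y^\sigma}{\partial x^\nu} + \frac{\partial y^\rho}{\partial x^\mu}\frac{\partial^2 y^\sigma}{\partial x^\lambda \partial x^\nu}\right)$$

Using the identity (I.4.14) you derived in chapter I.4, you could solve for
$\delta_{\rho\sigma}\frac{\partial^2 y^\rho}{\partial x^\lambda \partial x^\mu}\frac{\partial y^\sigma}{\partial x^\nu}$.
Lowering an index in (29) by using the metric in (26), you obtain $\Gamma_{\mu\nu\cdot\sigma}$ as given in (20).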
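
As a sanity check on (29), here is a worked example under the simplest assumption I can think of: take the flat coordinates to be the ordinary Cartesian ones on the plane, $y^1 = r\cos\theta$ and $y^2 = r\sin\theta$, and the “curved” coordinates to be $x = (r, \theta)$. (This choice is mine, made for illustration; here the flat coordinates happen to be flat everywhere, not just locally.) The needed derivatives are $\frac{\partial r}{\partial y^1} = \cos\theta$, $\frac{\partial r}{\partial y^2} = \sin\theta$, $\frac{\partial \theta}{\partial y^1} = -\frac{\sin\theta}{r}$, and $\frac{\partial \theta}{\partial y^2} = \frac{\cos\theta}{r}$, so that

$$\Gamma^r_{\theta\theta} = \frac{\partial r}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial\theta^2} = \cos\theta\,(-r\cos\theta) + \sin\theta\,(-r\sin\theta) = -r$$

$$\Gamma^\theta_{r\theta} = \frac{\partial \theta}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial r\,\partial\theta} = \left(-\frac{\sin\theta}{r}\right)(-\sin\theta) + \frac{\cos\theta}{r}\,\cos\theta = \frac{1}{r}$$

in agreement with (21).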


Appendix 1: Drowning in a sea of indices 


The appearance of the 3-indexed Christoffel symbol $\Gamma^\rho_{\mu\nu}$ has detonated the explosion of indices that gives Einstein
gravity its (undeserved) reputation of being difficult, and the reader is hereby warned that it will get much worse in
subsequent chapters when the 4-indexed Riemann curvature tensor† appears. On first exposure, some students
could easily feel that they are “drowning in a sea of indices.” Some sophisticated types favor a fancy-schmancy
index-free notation.‡ This is analogous to the vector notation $\vec{v}$ that you are fluent with, instead of the index
notation $v^i$. But it takes considerable effort to learn the index-free notation, and when push comes to shove, in an
actual calculation,* even a sophisticate might have to descend to indices. Besides, you have to learn to walk before
you can fly, and I think that for a first introduction to Einstein gravity, grappling with indices is an essential and
ennobling experience.

* This peculiar replacement of a simple equation by a complicated one foreshadows Einstein’s deep insight
about gravity. See the discussion of the equivalence principle in part V.

† The astute reader might sense that the appearance of 3- and 4-indexed objects was foreshadowed by the
expansion $g_{\mu\nu}(x) = g_{\mu\nu}(0) + A_{\mu\nu,\lambda} x^\lambda + B_{\mu\nu,\lambda\sigma} x^\lambda x^\sigma + \cdots$ we used in chapter I.6 to go to locally flat coordinates.

‡ Which we will eventually get to in chapter IX.7.

Different people have different ways of remembering how the indices go. My advice is to remember the
symmetry properties of various objects we encounter, for example, that $\Gamma^\rho_{\mu\nu}$ is defined to be symmetric in $\mu\nu$. I
remember the schematic form $\Gamma^{\cdot}_{\cdot\cdot} \sim g^{\cdot\cdot}\,\partial_{\cdot}\, g_{\cdot\cdot}$, which enables me to reconstruct the precise expression (18).


Appendix 2: How the Christoffel symbol transforms 


The beginning student should skip this appendix. Although it contains a result we will need later, it also contains 
“a sea of indices”! 
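Knowing how $g_{\mu\nu}$ transforms under a coordinate transformation (I.5.14), you could immediately plug that
transformation law into (18) to determine how $\Gamma^\rho_{\mu\nu}$ transforms. It is easier, however, to use (29) and grind ahead.
Following Professor Flat’s suggestion, we went from the locally flat coordinates $y$ to the coordinates $x$,
obtaining the $\Gamma^\lambda_{\mu\nu}$ in (29). What if somebody comes along and changes from $y$ to $x'$? She would obtain some
other Christoffel symbols $\Gamma^{\prime\lambda}_{\mu\nu}$. How are they related to our symbols?

To relate $\Gamma^{\prime\lambda}_{\mu\nu} = \frac{\partial x^{\prime\lambda}}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial x^{\prime\mu}\partial x^{\prime\nu}}$ to $\Gamma^\lambda_{\mu\nu}$, we simply replace the derivatives with respect to $x'$ by those
with respect to $x$. The calculation looks messy, but hey, it’s only the chain rule once again.

Start with $\frac{\partial y^\rho}{\partial x^{\prime\nu}} = \frac{\partial x^\kappa}{\partial x^{\prime\nu}}\frac{\partial y^\rho}{\partial x^\kappa}$, and so

$$\frac{\partial^2 y^\rho}{\partial x^{\prime\mu}\partial x^{\prime\nu}} = \frac{\partial x^\sigma}{\partial x^{\prime\mu}}\frac{\partial}{\partial x^\sigma}\left(\frac{\partial x^\kappa}{\partial x^{\prime\nu}}\frac{\partial y^\rho}{\partial x^\kappa}\right) = \frac{\partial x^\sigma}{\partial x^{\prime\mu}}\frac{\partial x^\kappa}{\partial x^{\prime\nu}}\frac{\partial^2 y^\rho}{\partial x^\sigma \partial x^\kappa} + \frac{\partial^2 x^\kappa}{\partial x^{\prime\mu}\partial x^{\prime\nu}}\frac{\partial y^\rho}{\partial x^\kappa}$$

The other factor in $\Gamma^{\prime\lambda}_{\mu\nu}$ is much easier to deal with: $\frac{\partial x^{\prime\lambda}}{\partial y^\rho} = \frac{\partial x^{\prime\lambda}}{\partial x^\tau}\frac{\partial x^\tau}{\partial y^\rho}$. Putting it together, we obtain

$$\begin{aligned}
\Gamma^{\prime\lambda}_{\mu\nu} = \frac{\partial x^{\prime\lambda}}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial x^{\prime\mu}\partial x^{\prime\nu}} &= \frac{\partial x^{\prime\lambda}}{\partial x^\tau}\frac{\partial x^\tau}{\partial y^\rho}\left\{\frac{\partial x^\sigma}{\partial x^{\prime\mu}}\frac{\partial x^\kappa}{\partial x^{\prime\nu}}\frac{\partial^2 y^\rho}{\partial x^\sigma \partial x^\kappa} + \frac{\partial^2 x^\kappa}{\partial x^{\prime\mu}\partial x^{\prime\nu}}\frac{\partial y^\rho}{\partial x^\kappa}\right\} \\
&= \frac{\partial x^{\prime\lambda}}{\partial x^\tau}\frac{\partial x^\sigma}{\partial x^{\prime\mu}}\frac{\partial x^\kappa}{\partial x^{\prime\nu}}\,\Gamma^\tau_{\sigma\kappa} + \frac{\partial x^{\prime\lambda}}{\partial x^\kappa}\frac{\partial^2 x^\kappa}{\partial x^{\prime\mu}\partial x^{\prime\nu}}
\end{aligned} \tag{30}$$

where we used $\frac{\partial x^\tau}{\partial y^\rho}\frac{\partial y^\rho}{\partial x^\kappa} = \delta^\tau_\kappa$ once again.

The Christoffel symbol does not transform as a tensor, and hence, is not a tensor!

Referring to chapters I.4 and I.5, we see that the first term in (30) is precisely what is needed for the Christoffel
symbol to transform as a tensor under the coordinate change $x \to x'$, but that is spoiled by the presence of the
inhomogeneous term $\frac{\partial x^{\prime\lambda}}{\partial x^\kappa}\frac{\partial^2 x^\kappa}{\partial x^{\prime\mu}\partial x^{\prime\nu}}$. To repeat, the important point here is that the Christoffel symbol is not a
tensor.

That the Christoffel symbol carries three indices and that it fails to transform nicely like a tensor are
two of the root causes of some technical complications of general relativity. We can handle something carrying two
indices: a square array of numbers may naturally be treated as a matrix, but not a cubical array. Note that we did
not go looking for this 3-indexed beast; it came looking for us.

Using the notation $S^\lambda{}_\tau = \frac{\partial x^{\prime\lambda}}{\partial x^\tau}$, $(S^{-1})^\sigma{}_\mu = \frac{\partial x^\sigma}{\partial x^{\prime\mu}}$, and $\partial'_\mu = (S^{-1})^\sigma{}_\mu\, \partial_\sigma$, introduced in chapters I.4 and I.5, we could
write (30) in a slightly nicer form:

$$\Gamma^{\prime\lambda}_{\mu\nu} = S^\lambda{}_\tau\, (S^{-1})^\sigma{}_\mu\, (S^{-1})^\kappa{}_\nu\, \Gamma^\tau_{\sigma\kappa} + S^\lambda{}_\kappa\, \partial'_\mu (S^{-1})^\kappa{}_\nu \tag{31}$$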


Appendix 3: Finding locally flat coordinates 
Professor Flat is getting visibly more and more excited. We gently inquire why. 


* See the story in endnote 23 to the preface. 
PF: “Don’t you see? Given how the flat coordinates $y$ depend on the curved coordinates $x$, you can calculate
$\Gamma$. Now just reverse the math. Suppose you are in a curved space at some point $P$ with coordinates $x_*$, and you
know $\Gamma$ at that point, then you can reverse the steps above and find the desired flat coordinates $y$.”

In chapter I.6, I showed you by simple counting that it should be possible by changing coordinates to make
the region around any point locally flat. But I didn’t show you how to do it explicitly. Now we see that all we have
to do is to solve (29) for $y$. The solution is

$$y^\rho(x) = K^\rho{}_\lambda\,(x^\lambda - x_*^\lambda) + \frac{1}{2}\, K^\rho{}_\lambda\, \Gamma^\lambda_{\mu\nu}(x_*)\,(x^\mu - x_*^\mu)(x^\nu - x_*^\nu) + \cdots \tag{32}$$

with the shorthand $K^\rho{}_\lambda = \left.\frac{\partial y^\rho}{\partial x^\lambda}\right|_{x_*}$ and with the dots indicating corrections of order $(x - x_*)^3$. We simply insert
this into (29) and verify that we recover $\Gamma^\lambda_{\mu\nu}(x_*)$. (You might want to compare with the discussion in chapter I.6.)
Of course, after obtaining $y$ as given by (32), we still have the freedom of applying an arbitrary rotation followed
by a translation without affecting (26) and (29). We didn’t include these to avoid cluttering up (32).

To see how all this works, it is best to go to an example. Imagine you are living at latitude* $\theta_*$ and longitude $\varphi_*$.
Since the metric does not depend on $\varphi$, we could set† $\varphi_* = 0$, but you might want to drag it along nevertheless.
Simply plug into (22) and (32).


Appendix 4: Geodesics on the Poincaré half plane 


Let us now find the geodesics on the Poincaré half plane defined by $ds^2 = (dx^2 + dy^2)/y^2$ in chapter I.5. First,
you may wish to recall the discussion there. We concluded that, to go from one point to another, we would want
to curve away from the edge at $y = 0$, but we were not able to determine the actual curve. Now yes, we can. We
are to minimize

$$D = \int_A^B \frac{\sqrt{dx^2 + dy^2}}{y} \tag{33}$$

Rudolf Peierls famously said² to the young Hans Bethe, “Erst kommt das Denken,³ dann das Integral.”
(Roughly, “First think, then calculate.”) Now that we have done the thinking in chapter I.5, we could simply plug
into the geodesic equation derived here. But wait, how about a tiny bit more Denken? Let’s exploit parametrization
invariance and choose the parameter that would minimize not only the integral but also our labor.

So, what did you choose? Inspired by your solution of the brachistochrone problem in exercise II.1.5 when you
went mano a mano with Isaac Newton, you chose $y$ as the parameter! Then‡
$D = \int_{y_A}^{y_B} dy\, \frac{1}{y}\sqrt{1 + \left(\frac{dx}{dy}\right)^2} = \int_{y_A}^{y_B} dy\, L$.
The key observation is that with this choice the integrand $L$ is independent of the “dynamical” variable $x$. In other
words, the Euler-Lagrange equation simplifies:

$$\frac{d}{dy}\left(\frac{\delta L}{\delta \frac{dx}{dy}}\right) = 0 \implies \frac{d}{dy}\left(\frac{\frac{dx}{dy}}{y\sqrt{1 + \left(\frac{dx}{dy}\right)^2}}\right) = 0 \tag{34}$$

We obtain immediately $\frac{dx}{dy} = \frac{y}{b}\sqrt{1 + \left(\frac{dx}{dy}\right)^2}$, with $b$ an integration constant. This elementary differential equation
is solved by $x - x_0 = \pm\sqrt{b^2 - y^2}$. The second integration constant $x_0$ reflects the translation invariance in $x$.
The geodesics are semi-circles of radius $b$ centered at any point $(x_0, 0)$ on the $x$-axis. (The $\pm$ sign we
encountered in the solution corresponds to the two halves of the semi-circle.) Note that the “vertical” lines
$x$ = constant also solve the geodesic equation. See figure 2.
Thus, the geodesic going from point A to point B could be determined by a geometric construction. Draw
a circle centered on the $x$-axis going through the two points. The circular arc between A and B is the desired
geodesic, in full accordance with our intuition in chapter I.5.
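
If you want to see the geometric construction in action, the following sketch carries it out numerically for two made-up points A and B. It finds the circle centered on the $x$-axis through both points, evaluates the integral (33) along the circular arc and along the naive straight segment, and compares. As an independent check I also quote the standard closed-form expression for the distance in the half-plane metric; that formula is not derived in this chapter and is used here only for comparison. All names in the code are my own.

```python
import math

def hyperbolic_length(path, steps=20000):
    """Approximate D in (33) along path(t), t in [0, 1], by a sum over small chords."""
    total = 0.0
    x_prev, y_prev = path(0.0)
    for k in range(1, steps + 1):
        x, y = path(k / steps)
        y_mid = 0.5 * (y + y_prev)
        total += math.hypot(x - x_prev, y - y_prev) / y_mid
        x_prev, y_prev = x, y
    return total

# Hypothetical endpoints A and B in the upper half plane.
ax, ay = 0.0, 1.0
bx, by = 3.0, 2.0

# Center (x0, 0) and radius b of the circle through A and B centered on the x-axis.
x0 = (bx ** 2 + by ** 2 - ax ** 2 - ay ** 2) / (2.0 * (bx - ax))
b = math.hypot(ax - x0, ay)
angle_a = math.atan2(ay, ax - x0)
angle_b = math.atan2(by, bx - x0)

def arc(t):
    """Circular arc from A to B."""
    angle = angle_a + (angle_b - angle_a) * t
    return x0 + b * math.cos(angle), b * math.sin(angle)

def chord(t):
    """Straight segment from A to B, for comparison."""
    return ax + (bx - ax) * t, ay + (by - ay) * t

arc_length = hyperbolic_length(arc)
chord_length = hyperbolic_length(chord)

# Standard closed form for the half-plane distance, quoted as an outside check.
closed_form = math.acosh(1.0 + ((bx - ax) ** 2 + (by - ay) ** 2) / (2.0 * ay * by))

print(arc_length, chord_length, closed_form)
```

The arc comes out shorter than the straight segment and agrees with the closed form, as it should if the semi-circle is indeed the geodesic. The construction assumes A and B do not sit on the same vertical line; in that case the geodesic is the vertical segment itself.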


* We mean physicists’ latitude, of course, with $\theta = 0$ at the north pole.

† The story of France’s losing battle to set $\varphi = 0$ for Paris makes for an interesting bit of history. Fortunately, the
British were not so arrogant as to set the latitude of Greenwich to 0 also. More recently, GMT, meaning Greenwich
Mean Time, was replaced by the compromise term UTC, an acronym neither in English nor in French.

‡ In fact, you can see that this problem bears a striking resemblance to the brachistochrone problem but is
not at all the same.

Figure 2 Geodesics on the Poincaré half plane.


Of course, if you chose the length $l$ along the curve instead of $y$ as the parameter, you could still solve the
problem readily. See exercise 4. After all, Euler and Lagrange already did the Denken for you.


Appendix 5: Coordinates from a family of geodesics 


Given a space with a coordinate system and a metric, we can determine the geodesics. Conversely, a family of 
geodesics can also lead us to a “natural” coordinate system. 

Consider 2-dimensional space and a family of geodesics. We restrict ourselves to a region in which the
geodesics do not intersect. (The coordinate system fails where the geodesics intersect. For example, the usual
spherical coordinates on the sphere are not well defined at the north pole, where lines of constant longitudes
intersect.) Label each geodesic by the parameter $\theta$ continuously. In other words, each value of $\theta$ uniquely specifies
a geodesic, and values of $\theta$ infinitesimally close to each other specify geodesics that neighbor each other. On each
geodesic, mark off the distance $l$ and call this the coordinate $r$; in other words $r = l$. This is equivalent to saying
$1 = g_{\mu\nu}\frac{dx^\mu}{dl}\frac{dx^\nu}{dl} = g_{rr}\left(\frac{dr}{dl}\right)^2 = g_{rr}$: the first equality is the definition of $l$, the second follows from our construction
that $\theta$ is constant along each geodesic, while the third is due to the choice $r = l$. Hence we have $g_{rr} = 1$.

Next, let’s see if we can get rid of the cross-term $dr\, d\theta$. Change coordinate by setting $r = \tilde{r} + h(\theta)$ so that
$dr = d\tilde{r} + h'(\theta)\, d\theta$. In the new coordinates $(\tilde{r}, \theta)$, we have the cross-term $2\left(g_{r\theta}(r, \theta) + h'(\theta)\right) d\tilde{r}\, d\theta$. Of course,
$g_{r\theta}(r, \theta)$ can be equivalently written as some function $k(\tilde{r}, \theta)$. Thus, in general, we cannot get rid of the cross-term
by suitably choosing $h(\theta)$.

But we are able to get rid of the cross-term at a specific value of $\tilde{r}$, call it $\tilde{r}_0$. In other words, we choose $h(\theta)$ so
that $h'(\theta) = -k(\tilde{r}_0, \theta)$.

To proceed further, we will be kind to ourselves and drop the tilde sign, that is, we rename $\tilde{r}$ and call it $r$. To
summarize, at this point we have $ds^2 = dr^2 + 2 g_{r\theta}\, dr\, d\theta + g_{\theta\theta}\, d\theta^2$, with $g_{r\theta}(r_0, \theta) = 0$.

We now plug in the geodesic equation for the coordinate $r$ and obtain
$0 = \frac{d^2r}{dl^2} = -\Gamma^r_{\mu\nu}\frac{dx^\mu}{dl}\frac{dx^\nu}{dl} = -\Gamma^r_{rr}$. The
second equality is the geodesic equation, while the first equality follows from $r = l$, the third from the fact that
along a geodesic $\theta$ is constant and $r = l$. Thus, we have learned that $\Gamma^r_{rr} = \frac{1}{2}\, g^{r\theta}\left(2\,\partial_r g_{r\theta}\right) = 0$. Here the first equality
follows from $g_{rr} = 1$. So, either $g^{r\theta} = 0$ or $\partial_r g_{r\theta} = 0$. In the first case, the 2-by-2 matrix $g^{\mu\nu}$ is diagonal and so its
inverse is also diagonal, meaning that $g_{r\theta} = 0$. In the second case, $\partial_r g_{r\theta} = 0$. In other words, $g_{r\theta}$ does not vary
as $r$ varies. But we already know that $g_{r\theta}(r_0, \theta) = 0$ for some $r_0$, and hence we can conclude that $g_{r\theta}(r, \theta) = 0$
for all $r$.

To conclude, for 2-dimensional spaces, we have found, at least in some region, a class of coordinate systems 


with the metric 
$$ds^2 = dr^2 + F(r, \theta)\, d\theta^2 \tag{35}$$

Note that in general we cannot simplify further.

You know some examples from this class. In particular, consider the subclass with metric $ds^2 = dr^2 + f(r)\, d\theta^2$.
Two examples are $ds^2 = dr^2 + r^2 d\theta^2$, polar coordinates for the plane, and $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$, spherical
coordinates for the sphere.


Exercises 


1 As explained in the text, to solve for the geodesics on the sphere, we could choose two of the three equations
(10), (11), and (12). The wise person chooses (11) and (12) (of course!). Solve (11) immediately and plug into
(12) to obtain

$$\left(\frac{d\theta}{dl}\right)^2 + \frac{K^2}{\sin^2\theta} = 1 \tag{36}$$

where $K$ is an integration constant. Show that this equation could be interpreted as that of a particle moving
in the potential $V(\theta) = \frac{K^2}{\sin^2\theta}$. Discuss.

2 Show that for diagonal metrics, at most one of the three terms on the right hand side of (18) defining the
Christoffel symbol actually survives.

3 Show that (18) and (20) follow from (29).

4 Determine the geodesics on the Poincaré half plane.

5 Use the transformation law (30) of the Christoffel symbol to determine $\Gamma^r_{\theta\theta}$ and $\Gamma^\theta_{r\theta}$ in polar coordinates.
Hint: Consider the coordinate transformation from $(x^1, x^2) = (x, y)$ to $(x^{\prime 1}, x^{\prime 2}) = (r, \theta)$.

6 Instead of varying the integral $\int d\lambda \sqrt{g_{\mu\nu}(X(\lambda))\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda}}$ in (13), we could, as an alternative, vary the integral
$\int d\lambda\, g_{\mu\nu}(X(\lambda))\frac{dX^\mu}{d\lambda}\frac{dX^\nu}{d\lambda}$. Show that this leads to the same geodesic equations as in the text. Although this
approach is arithmetically simpler, it is, as explained in the text, conceptually opaque, since this integral is
not parametrization invariant and does not represent a geometric quantity, such as the length of the curve.

7 In chapter I.6, in proceeding to locally flat coordinates, after the first step, with the metric already in the form
$g_{\mu\nu}(x) = \delta_{\mu\nu} + A_{\mu\nu,\lambda} x^\lambda + \cdots$, we claimed that by using the transformation $x^\mu = x^{\prime\mu} + L^\mu_{\nu\lambda} x^{\prime\nu} x^{\prime\lambda} + \cdots$, we
could get rid of the linear terms in the metric. Using the transformation property of the Christoffel symbol,
determine $L^\mu_{\nu\lambda}$. (Just from the index structure, you could probably guess the answer.)

8 Find the equation for geodesics in conformally flat spaces (which, as you might recall, were defined in an
exercise in chapter I.6).


Notes 


1. A tiny bit of subtlety gets glossed over here. See S. Weinberg, Gravitation and Cosmology. 
2. QFT Nut, p. 365. 
3. As in gedanken experiment. 
The action principle


You realize that in the last couple of chapters, I have been setting you up for a return to the 
prologue—to Fermat with his least time principle and to Feynman choosing the right path. 

Next time you are invited to a dinner party at the home of a philosophy professor, say the 
word “teleological” in the middle of the main course. After these guys have stopped clawing 
at each other, utter, with nonchalant total self-assurance, “The ontological is distinct from 
the epistemological, while the tautological is antithetical to the logical,” and watch the fun 
start again. That statement is of course what is known in polite circles as “utter nonsense” 
and in less polite circles as total BS, but it gives you an idea of how some academics talk. 

The philosophies-R-us version, which I could give you for no charge, is that things are 
teleological if they have a purpose, or at least act as if they have a purpose. That’s a big 
no-no in Western science. You see, Fermat’s least time principle¹ has a strongly teleological
flavor: that light, and particularly daylight, somehow knows how to save time—a flavor 
totally distasteful to the post-rational palate. In contrast, at the time of Pierre Fermat (1601 
or* 1607/08?-1665), there was lots of quasi-theological talk about Divine Providence and 
Harmonious Nature, so there was no question that light would be guided to follow the 
most prudent path. 

Thus, after the success of the least time principle for light, physicists naturally wanted to 
find a similar principle for material particles. Something is minimized, but what? Matter 
appears to behave quite differently from light, and this puzzled physicists for centuries.†


* The heavy academic controversy over Fermat’s birth year stems from his father marrying twice and naming
two sons from two different wives Pierre.²

† See the discussion in chapter III.5. The ultimate resolution had to wait until the advent of quantum field
theory, but that’s another story I’ve told elsewhere.
From the hanging string to the falling apple: Time makes its grand entrance


Now that you have mastered variational calculus, you can easily figure out whether New- 
ton’s F = ma follows from a variational principle. Consider the falling apple. Newton 
taught us to determine the path q(t) of the apple by solving 
$$m \frac{d^2 q}{dt^2} = -mg \tag{1}$$

The question is: what variational principle can give us this differential equation? 

I certainly dropped a trail of hints when I discussed the variational calculus. Recall (II.1.1)
and (II.1.4): we saw that the Euler-Lagrange equation for a hanging string


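$$T \frac{d^2 \phi}{dx^2} = \sigma g \tag{2}$$

follows from extremizing the energy functional

$$E(\phi) = \int dx \left\{ \frac{1}{2} T \left( \frac{d\phi}{dx} \right)^2 + \sigma g \phi(x) \right\} \tag{3}$$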


At this point, you don’t have to be a genius to see that (1) and (2) have the same form.
Simply replace $x \to t$, $\phi \to q$. Pretty nifty, eh? Remarkably, we simply flip figure II.1.1 over
and relabel the horizontal axis by $t$ instead of $x$, to obtain figure 1.

Voila, from the hanging string to the falling apple by replacing space with time! In other
words, extremizing the integral $S(q) = \int dt \{ \frac{1}{2} m (\frac{dq}{dt})^2 - m g q(t) \}$ gives us (1). Thus far, in
part II, we have had no time. Now time makes its grand entrance.

More generally, we have discovered, with no further ado, that extremizing 


$$S(q) = \int dt \left\{ \frac{1}{2} m \left( \frac{dq}{dt} \right)^2 - V(q) \right\} \tag{4}$$

yields the equation of motion

$$m \frac{d^2 q}{dt^2} = -V'(q) \tag{5}$$

for a particle moving in a potential V. The integral S(q) is known as the action.
The action S(q) is a functional of the path q(t). With each path, we assign a real number,
namely the action S(q) evaluated for that particular path q(t). We are then instructed to
vary S(q) subject to the boundary conditions (which, if we feel pedantic, we should perhaps
refer to as the initial and final conditions, since we are dealing with time) that $q(t_i) = q_i$
and $q(t_f) = q_f$. The particular path q(t) that extremizes the action S(q) satisfies Newton’s
law (5).


Figure 1 To go from the hanging string to the
falling apple, we simply flip figure II.1.1 over and
relabel the horizontal axis by t instead of x.
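The step from the action (4) to the equation of motion (5) can be written out in two lines. This is only a sketch: the symbol $\eta(t)$, a small deviation from the path that vanishes at the initial and final times, is introduced here for illustration and follows the usual variational calculus.

```latex
\[
\begin{aligned}
S(q + \eta) - S(q)
  &= \int dt \left\{ m \frac{dq}{dt} \frac{d\eta}{dt} - V'(q)\, \eta \right\} + O(\eta^2) \\
  &= - \int dt \, \eta(t) \left\{ m \frac{d^2 q}{dt^2} + V'(q) \right\} + O(\eta^2)
\end{aligned}
\]
```

The second line follows from an integration by parts; the boundary term drops out because $\eta$ vanishes at both ends. Demanding that the first order variation vanish for arbitrary $\eta(t)$ gives (5).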


The Lagrangian 


The action principle states that the movement of particles is determined by extremizing 
an action. In general, the action functional is given by 


$$S(q) = \int dt\, L\left( \frac{dq}{dt}, q \right) \tag{6}$$


The quantity $L(\frac{dq}{dt}, q)$ is known as the Lagrangian. The action is to be varied with the initial
and final positions $q(t_i)$ and $q(t_f)$ held fixed to some $q_i$ and $q_f$, respectively.
The variation of the action vanishes if 


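$$\frac{d}{dt} \left( \frac{\delta L}{\delta \frac{dq}{dt}} \right) - \frac{\delta L}{\delta q} = 0 \tag{7}$$

To be precise and pedantic, I stress, as in chapter II.1, that the notation means the
following. We pretend that L(a, b) is an ordinary function of two variables a and b. By
$\frac{\delta L}{\delta \frac{dq}{dt}}$ we mean $\frac{\partial L(a, b)}{\partial a}$ with a subsequently set equal to $\frac{dq}{dt}$ and b to q(t). Similarly, by $\frac{\delta L}{\delta q}$
we mean $\frac{\partial L(a, b)}{\partial b}$ with a subsequently set equal to $\frac{dq}{dt}$ and b to q(t).
Switching from Leibniz’s notation to Newton’s notation, we could write the Euler-Lagrange
equation (7) in the elegantly compact form

$$\frac{d}{dt} \frac{\delta L}{\delta \dot{q}} = \frac{\delta L}{\delta q} \tag{8}$$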
At Princeton University, there is a church-like gothic building with stained glass win- 
dows, each of which is inlaid with a fundamental equation of physics. One of the equations 
is (8). This equation, suitably generalized to quantum fields, underlies all known dynamics 
(we will come back to this shortly) in the universe. 


A minus sign 


In nonrelativistic mechanics, the Lagrangian $L(\frac{dq}{dt}, q)$ has the additive form

$$L\left( \frac{dq}{dt}, q \right) = \frac{1}{2} m \left( \frac{dq}{dt} \right)^2 - V(q) \tag{9}$$

(but that is not necessarily the case in general). Notice that the Lagrangian is equal to the
kinetic energy minus the potential energy. In a way, there is a “what else could it be?”
quality to what the Lagrangian turns out to be.

The minus sign is needed to generate the correct equation of motion (5), as you saw.
In the energy functional (3) for the string, the variation of $\phi$ in space adds to the potential
energy in the energy functional. In contrast, in the action functional (4), the variation of
q in time goes against the potential energy: the crucial minus sign. Consider the simple
harmonic oscillator, for which the action would be $S = \int dt [\frac{1}{2} m (\frac{dq}{dt})^2 - \frac{1}{2} k q^2]$. This is a
first hint that time and space work against each other. Time differs from space by a sign,
so to speak.


II.3. Physics Is Where the Action Is | 139


Choosing a path as a metaphor for life 


Let us see how the action principle works in a specific case. Consider an object falling from a
height of h to the ground in time T. The Lagrangian is then $L(\frac{dq}{dt}, q) = \frac{1}{2} m (\frac{dq}{dt})^2 - m g q(t)$.
To keep irrelevant symbols from cluttering the page, let us choose units so that g = 1. Then
Newton’s ma = F reads $\frac{d^2 q}{dt^2} = -1$, to be solved with the initial and final conditions q(0) = h
and q(T) = 0. Dear reader, I know that you can solve this equation practically in your sleep,
thus finding the familiar parabola

$$q(t) = h - \left( \frac{h}{T} - \frac{T}{2} \right) t - \frac{1}{2} t^2 \tag{10}$$


Note one important difference between Fermat’s least time principle governing light 
and the action principle governing material particles. In the least time principle, we are to 
minimize, duh, the transit time. In the action principle the transit time T is specified in 
advance. The particle is required to get to its destination in time T, the ultimate “on-time 
company.” Notice that in the particular solution q(t), the coefficient of the term linear in
t switches sign when $T^2$ becomes larger than 2h: Given too much time, the particle has
to “waste” some time moving up before coming down.

Just as in the discussion of the hanging string following (II.1.9), the potential in this
particular case is linear in q, and so, under $q(t) \to q(t) + \eta(t)$, the second order variation
of S is given by $\int dt\, \frac{1}{2} m (\frac{d\eta}{dt})^2$ and thus is manifestly positive. The path in (10) in fact minimizes
the action S. This is not true in general. (See exercise 1.) A simple calculation shows that,
for the path actually followed, namely the path in (10), the action is equal to (henceforth
we will factor out m and drop it) $S_{\rm min}(q) = \frac{h^2}{2T} - \frac{hT}{2} - \frac{T^3}{24}$.
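The minimization can also be tried out numerically. The following is a minimal sketch, assuming units with m = g = 1 and a path sampled at n + 1 equally spaced times; the function names are made up for this illustration. It solves the stationarity conditions of the discretized action (a tridiagonal linear system) and compares the result with the parabola $q(t) = h - (\frac{h}{T} - \frac{T}{2}) t - \frac{1}{2} t^2$ and with the closed form $\frac{h^2}{2T} - \frac{hT}{2} - \frac{T^3}{24}$ for the minimal action.

```python
# A sketch in units with m = g = 1; all function names are illustrative.

def exact_path(t, h, T):
    """The parabola q(t) = h - (h/T - T/2) t - t^2/2 with q(0) = h, q(T) = 0."""
    return h - (h / T - T / 2.0) * t - 0.5 * t * t


def discrete_action(q, dt):
    """Sum of [(1/2) velocity^2 - q] dt over the steps of a sampled path q."""
    total = 0.0
    for a, b in zip(q[:-1], q[1:]):
        velocity = (b - a) / dt
        total += (0.5 * velocity ** 2 - 0.5 * (a + b)) * dt
    return total


def stationary_path(h, T, n):
    """Solve d(action)/dq_i = 0 for the interior points, with q_0 = h and q_n = 0.

    The conditions read -q_{i-1} + 2 q_i - q_{i+1} = dt^2, a tridiagonal
    system solved here by forward elimination and back substitution.
    """
    dt = T / n
    m = n - 1                       # number of unknown interior points
    rhs = [dt * dt] * m
    rhs[0] += h                     # contribution of the fixed endpoint q_0 = h
    cp = [0.0] * m
    dp = [0.0] * m
    cp[0] = -0.5
    dp[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + cp[i - 1]
        cp[i] = -1.0 / denom
        dp[i] = (rhs[i] + dp[i - 1]) / denom
    x = [0.0] * m
    x[-1] = dp[-1]
    for i in range(m - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return [h] + x + [0.0]


if __name__ == "__main__":
    h, T, n = 1.0, 2.0, 1000
    path = stationary_path(h, T, n)
    print(discrete_action(path, T / n), h ** 2 / (2 * T) - h * T / 2 - T ** 3 / 24)
```

Adding any wiggle that vanishes at the endpoints to the stationary path should raise the discretized action, in line with the second order variation being positive.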

It is intriguing that in the least action formulation of mechanics, the particle gives the 
impression of choosing a path to follow, as if it could sample all possible paths* before 
finding the one with the minimum action. “Mmm, this path is no good. Let me try another.” 
Let’s have fun seeing how this actually works. 

What does the particle decide to do? Should the particle sit around and suddenly make 
a mad dash? Or should it get going promptly and then coast to the destination? What sort 
of “personality” does it have? 

Perhaps the particle does something in between. Suppose that it decides to just forget
about Newton and to fall down with constant speed, that is, following the path $q_l(t) = h - \frac{h}{T} t$.
A steady kind of guy going at a steady pace. You could check that the action is then
$S(q_l) = \frac{h^2}{2T} - \frac{hT}{2}$, indeed more than $S_{\rm min}(q)$ given above.


* This is a profound statement. To some extent, one could argue that the existence of the action principle in
classical physics foreshadows the quantum world. See “Local versus global” later in this chapter.


140 | II. Action, Symmetry, and Conservation

In everyday life, a falling object, especially if it is fragile and valuable, appears to hesitate
for a moment or so, almost as if it is saying “Catch me if you can!” before gathering speed
and crashing to the floor. That’s Galileo’s law of acceleration in action of course. From the
action point of view, you can understand what is going on. The object, by staying at high
altitude for “as long as possible,” maximizes its potential energy and thus lowers the action.
But then it has to rush at the end to get to the floor in the allotted time T, and hence pays the
price of a larger kinetic energy. You could easily compute that the particle, by choosing the
actual path q(t) rather than the alternate path $q_l(t)$, pays an extra time-integrated kinetic
energy equal to $\frac{T^3}{24}$, but it also raises its time-integrated potential energy by $2(\frac{T^3}{24})$, thus
managing to lower its action. It has figured out how to get the best action deal.
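The bookkeeping of this "action deal" is easy to check by direct integration. Below is a small sketch, again assuming m = g = 1; the helper names and the use of Simpson's rule are choices made for this illustration only. It compares the parabolic path with the constant-speed path $h - \frac{h}{T} t$ and returns the extra time-integrated kinetic energy, the extra time-integrated potential energy, and the resulting change in the action.

```python
# A sketch in units with m = g = 1; helper names are illustrative.

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0


def integrated_energies(q, v, T):
    """Time-integrated kinetic and potential energy along a path q(t) with velocity v(t)."""
    kinetic = simpson(lambda t: 0.5 * v(t) ** 2, 0.0, T)
    potential = simpson(q, 0.0, T)
    return kinetic, potential


def compare_paths(h, T):
    """Actual (parabolic) path versus the constant-speed path from height h to 0 in time T."""
    v0 = T / 2.0 - h / T            # initial velocity of the parabolic path
    k_true, p_true = integrated_energies(
        lambda t: h + v0 * t - 0.5 * t * t, lambda t: v0 - t, T)
    k_line, p_line = integrated_energies(
        lambda t: h - h * t / T, lambda t: -h / T, T)
    return {
        "extra_kinetic": k_true - k_line,
        "extra_potential": p_true - p_line,
        "action_difference": (k_true - p_true) - (k_line - p_line),
    }


if __name__ == "__main__":
    print(compare_paths(1.0, 2.0))
```

For any h and T the three numbers should come out as $\frac{T^3}{24}$, $\frac{T^3}{12}$, and $-\frac{T^3}{24}$: the particle pays in kinetic energy but gains twice as much in potential energy.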

Some would see in the action principle a metaphor for life. You want to live life maximiz- 
ing something, perhaps the total time-integrated happiness. You could either party now, 
dude, or you could study the action principle and party later in life. Of course, physics is 
so much simpler than real life, for which the quantity corresponding to the Lagrangian 
consists of a multitude of terms, each with zillions of parameters that vary from individual 
to individual. For example, for some geeks, studying physics has got to be way more fun 
than partying. There is also the minor detail that T is not known in advance. 


Newton versus Aristotle 


I have restricted the discussion to a single particle moving in 1 dimension. As in chapter
I.1, through the magic of indices, we can immediately jump to the case of many particles
moving in any dimension you desire. For instance, for a particle moving in 3-dimensional
space, the position of the particle is specified by a 3-vector $\vec{q}(t)$. By rotational invariance
(that is, “no direction is special”), the Lagrangian $L(\frac{d\vec{q}}{dt}, \vec{q})$ must be a scalar. This pretty
much restricts it to have the form

$$L\left( \frac{d\vec{q}}{dt}, \vec{q} \right) = \frac{1}{2} m \left( \frac{d\vec{q}}{dt} \right)^2 - V(\vec{q}) \tag{11}$$

where $(\frac{d\vec{q}}{dt})^2$ represents the dot product $\frac{d\vec{q}}{dt} \cdot \frac{d\vec{q}}{dt}$ of course. You might also think of adding
the term $\vec{q} \cdot \frac{d\vec{q}}{dt}$, but this is equal to $\frac{1}{2} \frac{d\vec{q}^{\,2}}{dt}$ and so contributes to the action $S = \int dt\, L$ merely
a boundary term $\frac{1}{2} (\vec{q}^{\,2}(T) - \vec{q}^{\,2}(0))$. The equation of motion is not affected.

Aristotle thought that force is equal to velocity. It took the full genius of Newton to point 
out that no, force is in fact equal to acceleration. That, F = ma, is certainly one of the 
least obvious ideas in the history of theoretical physics: just try explaining it to a medieval 
peasant struggling to keep his cart moving. 

When you first studied physics, didn’t you wonder why Nature went against Aristotle
and chose acceleration, and not velocity? Here is the answer. In some sense, rotational
invariance and the action principle dictate the form of the kinetic term in (11). The
Lagrangian contains two powers of $\frac{d}{dt}$. No way one of these is to disappear on our way to
the equation of motion. It has to be a second order differential equation. This fundamental
truth in fact continues to hold as we move onward to more advanced physics, all the way
to the action of fundamental forces governed by quantum field theory.

I don’t claim that this is a proof that would satisfy mathematicians. For example, we could 
have a term like (4 : 44) 4 in the Lagrangian, but this sure would not lead to mechanics 
as we know it. 


Local versus global 


Newton’s equation of motion is described as local in time: it tells us what is going to 
happen in the next instant. In contrast, the action principle is global: one integrates over 
various possible trajectories and chooses the best one. While the two formulations are 
mathematically entirely equivalent, the action principle offers numerous advantages over 
the equation of motion approach. We mention some interesting points here. 


1. The action leads directly to an understanding of quantum mechanics via the so-called Dirac- 
Feynman path integral* formulation. Indeed, the discussion here gives a premonition of 
the emergence of probability in the quantum world. Which path would the particle choose? 
Betting odds, anybody? 


2. Intriguingly, while the energy functional is unequivocally asking to be minimized, the action 
principle merely tells us to extremize, rather than to maximize or minimize, a functional. 
In exercise 1, you will show that for the harmonic oscillator, the actual path can correspond 
to either a maximum or a minimum of the action. Within classical mechanics, this would 
appear somewhat puzzling, at least to me when I was a student. In the Dirac-Feynman 
path integral, this fact emerges naturally: classical paths correspond to the stationary phase 
in the sum over amplitudes. I consider this insight to be one of the great triumphs of 
quantum physics. Unfortunately, this point is obscured in the more familiar Schrödinger
or Heisenberg formalisms.


3. The action principle gives a deeply satisfying and unifying understanding of conservation 


laws, as we will discuss in the next chapter. 


4. The fundamental interactions we know about—the strong, weak, electromagnetic, and 
gravitational—can all be described by the action principle.* As you will see in this book, 
the action principle provides a natural route to special relativity, electromagnetism, and 
Einstein gravity. The action, rather than equations of motion, furnishes the language of 
quantum field theory. For instance, in perturbative field theory, we can go directly from the 


action to Feynman diagrams without ever mentioning equations of motion. 


* Why this should be so represents a profound mystery.


5. A practical, but a relatively minor, advantage is that since the action involves only first derivatives
with respect to time, rather than second derivatives as in the equation of motion, it saves
us computational labor in changing variables. An example is provided by Newton’s solution
for planetary orbits. In changing coordinates $x, y \to r, \theta$, we do not have to differentiate
twice to get to (I.1.11); we could stop at (I.1.10), square, and add to obtain $\dot{x}^2 + \dot{y}^2 = \dot{r}^2 + r^2 \dot{\theta}^2$.
The action then becomes

$$S(r, \theta) = m \int dt \left\{ \frac{1}{2} (\dot{x}^2 + \dot{y}^2) + \frac{\kappa}{r} \right\} = m \int dt \left\{ \frac{1}{2} (\dot{r}^2 + r^2 \dot{\theta}^2) + \frac{\kappa}{r} \right\} \tag{12}$$

Indeed, what we are effectively doing is replacing $dx^2 + dy^2$ by $dr^2 + r^2 d\theta^2$, or more
generally, $g_{\mu\nu} dx^\mu dx^\nu$. (Note that we are using the notation of chapters I.5 and I.6, in
particular, Greek indices.) Here S is a functional of two functions, $r(t)$ and $\theta(t)$, so that
varying, we obtain two Euler-Lagrange equations: $\ddot{r} = r \dot{\theta}^2 - \frac{\kappa}{r^2}$ and $\frac{d}{dt}(r^2 \dot{\theta}) = 0$. Note that
the conservation of angular momentum pops out, without our having to derive (I.1.13) and
stare at it to recognize its more compact form.


6. Relating our discussion here to that in chapters I.5 and I.6, we see that the action allows us
to immediately formulate mechanics in any coordinate system and in curved space. Simply
replace (11) by

$$L\left( \frac{dq}{dt}, q \right) = \frac{1}{2} m\, g_{\mu\nu}(q) \frac{dq^\mu}{dt} \frac{dq^\nu}{dt} - V(q) \tag{13}$$

as in (12).


7. In the treatment in chapter I.1 involving equations of motion, the mass of the planet m 
drops out. In the action formalism here, this corresponds to m appearing merely as an 
overall factor for the action in (12). As I mentioned in chapter I.1, this fact will play a central 


role in Einstein gravity. 
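The change of variables in point 5 of the list above needs only first derivatives, and that is easy to check numerically. The sketch below is an illustration with made-up function names: it applies the chain rule to x = r cos θ, y = r sin θ and confirms that the kinetic term has the same value in both coordinate systems, and that the combination $r^2 \dot{\theta}$ whose conservation pops out of the action is the familiar angular momentum per unit mass $x \dot{y} - y \dot{x}$.

```python
import math


def cartesian_velocity(r, theta, r_dot, theta_dot):
    """Chain rule for x = r cos(theta), y = r sin(theta): first derivatives only."""
    x_dot = r_dot * math.cos(theta) - r * theta_dot * math.sin(theta)
    y_dot = r_dot * math.sin(theta) + r * theta_dot * math.cos(theta)
    return x_dot, y_dot


def kinetic_cartesian(x_dot, y_dot):
    """(1/2)(x_dot^2 + y_dot^2), the kinetic term per unit mass."""
    return 0.5 * (x_dot ** 2 + y_dot ** 2)


def kinetic_polar(r, r_dot, theta_dot):
    """(1/2)(r_dot^2 + r^2 theta_dot^2), the same quantity in polar coordinates."""
    return 0.5 * (r_dot ** 2 + r ** 2 * theta_dot ** 2)


def angular_momentum_cartesian(r, theta, r_dot, theta_dot):
    """x y_dot - y x_dot, which should equal r^2 theta_dot."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    x_dot, y_dot = cartesian_velocity(r, theta, r_dot, theta_dot)
    return x * y_dot - y * x_dot


if __name__ == "__main__":
    r, theta, r_dot, theta_dot = 2.0, 0.7, 0.3, -1.1
    print(kinetic_cartesian(*cartesian_velocity(r, theta, r_dot, theta_dot)),
          kinetic_polar(r, r_dot, theta_dot))
```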


A particle at rest will remain at rest 


For a pedagogical exercise, imagine that you and I were around in the early 18th century. 
How could we, doused with a liberal dose of hindsight, have developed the principle that 
governed the motion of material particles? What I will describe is not how it actually 
happened, but what we can readily imagine as how it could have happened, a sort of 
alternative physics history, perhaps in another civilization far far away in another galaxy. 

We have heard of Fermat’s marvelous principle, but material particles obviously can’t 
also follow a least time principle, since their speeds can vary according to their energies. 

Consider the simplest case of a particle moving in 1-dimensional space. To get started, 
suppose that there is no force and the particle just sits there: the couch potato problem. 
What is the particle minimizing? Obviously, it is keeping its kinetic energy as small as 
possible. 

Define the quantity $S(q) = \int dt\, \frac{1}{2} m (\frac{dq}{dt})^2$, which we will call the action. The integral over
t is evaluated from some initial time, call it 0, to some final time T. The particle starts
from some initial position q(0) and ends up at some final position q(T). To ease writing,
choose the coordinate such that q(0) = 0. We also impose q(T) = q(0) = 0. Evidently, S is
minimized for $\frac{dq}{dt} = 0$, which when solved with the stated boundary conditions gives us
q(t) = 0. Indeed, we have successfully described, by design, a particle sitting at rest. We
have solved the couch potato problem, and perhaps we can even ask for tenure.


The law of inertia 


So far so good. Next, we are emboldened to ask what would happen if at time T, we require 
the particle to be in some other position* q(T) = Q. Is our action principle smart enough 
to tell us that the particle will get there with constant speed? 

Consider the path $q(t) = \frac{Q}{T} t + \sigma(t)$. Evidently, $\sigma(t)$, subject to the boundary conditions
$\sigma(0) = \sigma(T) = 0$, describes deviation from “the straight and narrow path.” Plug $\frac{dq}{dt} = \frac{Q}{T} + \frac{d\sigma}{dt}$ into our action:

$$\begin{aligned} S &= \frac{m}{2} \int_0^T dt \left( \frac{Q}{T} + \frac{d\sigma}{dt} \right)^2 = \frac{m Q^2}{2T} + \frac{m Q}{T} \int_0^T dt\, \frac{d\sigma}{dt} + \frac{m}{2} \int_0^T dt \left( \frac{d\sigma}{dt} \right)^2 \\ &= \frac{m Q^2}{2T} + \frac{m}{2} \int_0^T dt \left( \frac{d\sigma}{dt} \right)^2 \end{aligned}$$

where in the last step we used the boundary conditions. This is obviously minimized by
$\frac{d\sigma}{dt} = 0$, which together with the boundary conditions implies that $\sigma(t) = 0$, and thus the
actual path $q(t) = \frac{Q}{T} t$.
A small triumph: we have recovered the law of inertia! In the absence of an external
force, the particle continues to move† at a constant velocity $\frac{Q}{T}$. No force: no speeding up,
no slowing down.
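To see the law of inertia emerge numerically, one can add a specific wiggle to the straight path and watch the action go up. The sketch below assumes a deviation of the form ε sin(kπt/T), which vanishes at t = 0 and t = T; this particular shape, and the function names, are choices made for this illustration only. For that deviation the excess action should be $\frac{m \varepsilon^2 k^2 \pi^2}{4T}$, positive for any nonzero ε.

```python
import math


def action_with_wiggle(Q, T, eps, mode=1, m=1.0, n=2000):
    """Free-particle action (m/2) * integral of (dq/dt)^2 for
    q(t) = Q t / T + eps * sin(mode * pi * t / T), by the midpoint rule."""
    dt = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        velocity = Q / T + eps * (mode * math.pi / T) * math.cos(mode * math.pi * t / T)
        total += 0.5 * m * velocity ** 2 * dt
    return total


def straight_path_action(Q, T, m=1.0):
    """Action of the constant-velocity path, m Q^2 / (2 T)."""
    return m * Q ** 2 / (2.0 * T)


if __name__ == "__main__":
    Q, T = 3.0, 2.0
    for eps in (0.0, 0.1, 0.5):
        print(eps, action_with_wiggle(Q, T, eps) - straight_path_action(Q, T))
```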

When we first learned to solve Newton’s law, we were typically given the initial position 
$q(t = 0)$ and the initial velocity $\frac{dq}{dt}(t = 0)$, rather than the initial and final positions, as in
the action principle. Of course we can also solve Newton’s law with specified initial and 
final positions. Mathematically, we have a second order differential equation, so in any 
case, we need two conditions to fix two integration constants. 


Getting the potato off the couch 


Let us now go back to the couch potato and see how we could nudge him into motion. Turn
on the gravitational potential V(q) = mgq. Note that q is a vertical coordinate pointing up:
larger q corresponds to higher potential energy. Our boundary conditions q(0) = q(T) = 0
imply that the initial velocity $\frac{dq}{dt}(0)$ must be positive: the particle shoots upward and
eventually falls back down. With a negative initial velocity, the boundary conditions could
not be satisfied.

* Compare 3:10 to Yuma. The problem there is that you have to get to Yuma, Arizona, at 3:10 PM.
† Which is of course a highly nontrivial statement that essentially got physics started.


The question is how to modify the free particle action $S(q) = \int dt\, \frac{1}{2} m (\frac{dq}{dt})^2$. Indeed, by
dimensional analysis, our desired action principle is pretty much fixed: in the integrand,
we can either add or subtract the potential energy. As mentioned earlier, adding would
just mean the integrand becomes the total energy. So we try subtracting and write
$S(q) = m \int dt \{ \frac{1}{2} (\frac{dq}{dt})^2 - g q \}$.

What the minus sign does is that now the particle can get a better deal on the action
by moving to positive q, and thus lowers the action via the second term, compensating
for the gain in the first term. We can easily estimate how high h the particle has to rise to
get the best deal. Dropping the overall m, we have $S \sim T (\frac{h}{T})^2 - T g h$: as anticipated, the
kinetic term wants the particle to stay on the couch (h = 0), while gravity urges it to go up
(h > 0). Extremizing in h, we obtain the familiar $h \sim g T^2$.
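The estimate can be spelled out in one line. As a sketch, treat the rough action as an ordinary function of the height h and set its derivative to zero:

```latex
\[
\frac{d}{dh} \left[ T \left( \frac{h}{T} \right)^2 - T g h \right]
  = \frac{2h}{T} - T g = 0
  \quad \Longrightarrow \quad
  h = \frac{1}{2} g T^2
\]
```

Only the scaling with g and T should be trusted here, not the numerical factor. Solving Newton’s equation exactly for a particle that leaves q = 0 and returns at time T gives an initial velocity gT/2 and a maximum height $\frac{1}{8} g T^2$, the same dependence on g and T with a different coefficient.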

I am somewhat surprised that Newton didn’t discover the action principle. Indeed, I see
the brachistochrone problem, with its flavor of least time, as a bridge between the least 
time and the action principles. Leibniz, Newton’s constant rival, apparently almost had it. 


The Hamiltonian 


After Lagrange invented the Lagrangian, Hamilton invented the Hamiltonian.*
Given a Lagrangian $L(\dot{q}, q)$, define the momentum by $p = \frac{\delta L}{\delta \dot{q}}$ and the Hamiltonian by

$$H(p, q) = p \dot{q} - L(\dot{q}, q) \tag{14}$$

where it is understood that $\dot{q}$ on the right hand side is to be eliminated in favor of p.

Let us illustrate this procedure by a simple example. Given the Lagrangian $L(\dot{q}, q) = \frac{1}{2} m \dot{q}^2 - V(q)$ in (9),
we have $p = m \dot{q}$, which is precisely what we normally mean by
momentum. The Hamiltonian is then given by

$$H(p, q) = p \dot{q} - L(\dot{q}, q) = p \dot{q} - \frac{1}{2} m \dot{q}^2 + V(q) = \frac{p^2}{2m} + V(q) \tag{15}$$

where in the last step we wrote $\dot{q} = p/m$. You should recognize the final expression as the
total energy, namely the sum of the kinetic energy $\frac{p^2}{2m}$ and the potential energy V(q). The
Hamiltonian represents the total energy of the system.
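The recipe (define p, eliminate the velocity, form $p \dot{q} - L$) can be mimicked in a few lines of code. This is a minimal sketch for the special case $L = \frac{1}{2} m \dot{q}^2 - V(q)$ only, where the elimination is simply $\dot{q} = p/m$; the function names are illustrative. It also checks that H stays constant along a free-fall trajectory, as expected of the total energy.

```python
def lagrangian(q_dot, q, m, V):
    """L = (1/2) m q_dot^2 - V(q)."""
    return 0.5 * m * q_dot ** 2 - V(q)


def hamiltonian(p, q, m, V):
    """H = p q_dot - L, with q_dot eliminated in favor of p via q_dot = p / m."""
    q_dot = p / m
    return p * q_dot - lagrangian(q_dot, q, m, V)


def free_fall_state(t, h, v0, m, g):
    """Position and momentum at time t for q(t) = h + v0 t - g t^2 / 2."""
    q = h + v0 * t - 0.5 * g * t * t
    p = m * (v0 - g * t)
    return q, p


if __name__ == "__main__":
    m, g, h, v0 = 2.0, 9.8, 10.0, 1.5
    V = lambda q: m * g * q
    for t in (0.0, 0.5, 1.0):
        q, p = free_fall_state(t, h, v0, m, g)
        print(t, hamiltonian(p, q, m, V))
```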


Lagrange and Feynman 


We close with two small stories about two towering figures. 

Starting when he was 18, Joseph-Louis, the Comte de Lagrange (who, by the way, 
was born Giuseppe Lodovico Lagrangia before the term “Italian” existed), worked on the 
problem of the tautochrone, which nowadays we would describe as the problem of finding 
the extremum of functionals. A year or so later, he sent a letter to Leonhard Euler, the 
leading mathematician of the time, to say that he had solved the isoperimetrical problem: 
for curves of a given perimeter, find the one that would maximize the area enclosed. Euler 
had been struggling with the same problem, but he generously gave the teenager full credit.
Later, he recommended that Lagrange should succeed him as the director of mathematics
at the Prussian Academy of Sciences.


* We mention this here for future use in part III. There is of course a lot more to the Hamiltonian than given
here, but this is all we need.

Richard Feynman (1918-1988) recalled that when he first learned of the action principle, 
he was blown away. Indeed, the action principle underlies some of Feynman’s deepest 
contributions to theoretical physics. In particular, his formulation of quantum mechanics 
depends very much on the action. I have left a clue in one sentence in the text for the 
curious student on how quantum physics can be understood without writing down the 
Schrödinger equation. Here you have learned how to do classical mechanics without
Newton’s equation. 


Appendix 1: Particles and fields: Each telling the other how to behave 


The action principle is now practically begging us to add the action (11) telling us how masses move in a 
gravitational potential to the energy functional (II.1.11) in chapter II.1 telling us how masses generate the 
gravitational field. 

First of all, let us generalize (11) trivially to incorporate a whole bunch of masses through the magic trick of 
indices: 


$$S_{\rm matter} = \int dt \sum_a \left\{ \frac{1}{2} m_a \left( \frac{dq_a}{dt} \right)^2 - m_a \Phi(q_a(t)) \right\} \tag{16}$$


We have used the fact that the potential V felt by the ath particle is given by $m_a \Phi$, with the gravitational potential
$\Phi(x)$ evaluated at $x = q_a(t)$. (We now suppress the arrow on various vector quantities.)

In chapter II.1, we learned that the gravitational potential $\Phi(x)$ is determined by minimizing (II.1.11)
$E(\Phi) = \int d^3x \left( \frac{1}{8\pi G} (\nabla \Phi)^2 + \rho(x) \Phi(x) \right)$, where the mass density $\rho(x, t) = \sum_a m_a \delta^3(x - q_a(t))$ is given by a
sum of spikes centered on each of the particles. But wait, this is an energy functional, not an action. We now turn
it into an action by the simple expedient of integrating its negative over time (the overall minus sign is because
$E(\Phi)$ is a potential energy):

$$S_{\rm gravity} = -\int dt \int d^3x \left( \frac{1}{8\pi G} (\nabla \Phi)^2 + \rho(x, t) \Phi(x, t) \right) \tag{17}$$


This is of course the infamous instantaneous action at a distance of Newtonian gravity: at time t, the gravitational
potential $\Phi(x, t)$ is determined by $\rho(x, t)$ at the same time. (Needless to say, you should distinguish between the
two different uses of the word “action”!) You may have also noticed that I have already stealthily snuck a time
dependence into the mass density $\rho(x, t)$: after all, the location $q_a(t)$ of the particles changes with time. It is a
free country: the particles are allowed to move around.

Confusio: “Do we now add the two actions, $S_{\rm matter}$ and $S_{\rm gravity}$, together?”

No!

We see that, after we substitute the expression above for $\rho(x, t)$ as a sum of delta functions into the second
term in $S_{\rm gravity}$ in (17) and then integrate over space, we obtain for this term

$$-\int dt \int d^3x \sum_a m_a \delta^3(x - q_a(t)) \Phi(x) = -\int dt \sum_a m_a \Phi(q_a(t))$$

But this is precisely the second term in $S_{\rm matter}$ given in (16). So Confusio, if you add the two actions together, you
would be double counting.

Thus, to obtain the total action for this Newtonian world, we should merge, rather than add, $S_{\rm matter}$ and $S_{\rm gravity}$.

The correct action is

$$S = \int dt \left\{ \sum_a \frac{1}{2} m_a \left( \frac{dq_a}{dt} \right)^2 - \int d^3x \left[ \frac{1}{8\pi G} (\nabla \Phi)^2 + \sum_a m_a \delta^3(x - q_a(t)) \Phi(x, t) \right] \right\} \tag{18}$$

The first term describes the dynamics of the particles, the second term describes the field $\Phi(x, t)$, and the third
term couples the particles and the field together.

In Newtonian gravity, the field dictates how the particles move, and the particles in turn generate the field. 

If you think that the action (18) is an ugly and awkward mess, you are completely right. The gravitational field 
pervades space but has no dynamics in time, merely reacting instantaneously to the location of the particles, while 
the particles are treated as points. How these two issues are resolved comprises two magnificent achievements 
in physics. 

Dear reader, perhaps you have already seen a way of endowing the field with dynamics. By analogy with the
particle’s kinetic term $\frac{1}{2} m_a (\frac{dq_a}{dt})^2$, which is quadratic in time derivative, you might add to the second term in (18)
a term quadratic in time derivative also, thus changing $(\nabla \Phi)^2$ to $-\frac{1}{C^2} (\frac{\partial \Phi}{\partial t})^2 + (\nabla \Phi)^2$, with C some unknown
constant with the dimension of length over time (that is, the dimension of speed) to make the dimensions come
out right. Varying S and repeating the same steps that led to (II.1.10), we see that the equation determining $\Phi$ is
changed to

$$\left( -\frac{1}{C^2} \frac{\partial^2}{\partial t^2} + \nabla^2 \right) \Phi = 4 \pi G \rho \tag{19}$$


Now $\Phi$ no longer reacts instantaneously to $\rho$ at the same time t. No more Newton’s spooky action at a distance!
The gravitational potential now has dynamics and takes time to propagate from one point in space to another.
Indeed, in empty space, with $\rho = 0$, the solution to (19) has the familiar wave form $\Phi(x, t) = A \sin(\omega t - \vec{k} \cdot \vec{x}) + B \cos(\omega t - \vec{k} \cdot \vec{x})$,
with $\omega^2 = C^2 \vec{k}^2$. The unknown constant C measures the speed of propagation.
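The relation between frequency and wave vector follows by direct substitution. As a short check, assuming the modified field equation in empty space reads $\left( -\frac{1}{C^2} \frac{\partial^2}{\partial t^2} + \nabla^2 \right) \Phi = 0$, act on one of the two terms of the plane wave (the cosine works the same way):

```latex
\[
\left( -\frac{1}{C^2} \frac{\partial^2}{\partial t^2} + \nabla^2 \right)
  \sin(\omega t - \vec{k} \cdot \vec{x})
  = \left( \frac{\omega^2}{C^2} - \vec{k}^{\,2} \right)
  \sin(\omega t - \vec{k} \cdot \vec{x})
\]
```

This vanishes for all x and t only if $\omega^2 = C^2 \vec{k}^{\,2}$, so a crest of fixed phase moves with speed $\omega / |\vec{k}| = C$.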

If you indeed saw all this, congratulations, you are some kind of a genius. You had foreseen Lorentz invariance 
and predicted the existence of gravitational waves. We will return to this in chapter IX.4. 

But still, classical physics cannot remove the dichotomy between the field pervading space and the particles 
localized as points. Only by going into the quantum realm, where particles can be realized as excitations in fields, 
do we have a pleasingly unified description of Nature. This dichotomy between field and particle in fact provides 
one of the driving motivations* for quantum field theory. As you will eventually learn (but not in this text), in 
quantum field theory, the form $\frac{1}{C^2} (\frac{\partial \Phi}{\partial t})^2 - (\nabla \Phi)^2$ in the modified action is equivalent to the statement that the
graviton, the particle associated with the gravitational field, is massless. 


Appendix 2: The string in action and light cone coordinates 


In chapter II.1, the string is just hanging there limply, far from being an action hero. With the action principle, it 
is a cinch to make it spring into action. The displacement $\phi(x)$ of the string segment labeled by x now depends
on time and so has to be promoted to $(t, x). Denote by ¢ the mass per unit length, so that our little string 
segment has mass ¢dx. Add a kinetic energy term f dx} ee to the energy functional of the string given in 
chapter II.1. The action becomes 


s= far fax E (%) (3 @) osot.»)| 
ni fof | (2) -2@) +09] 


Note the relative signs between the terms: as we now realize, the energy functional defined in chapter II.1 should 

really be called the potential energy functional and as we learned in this chapter, the kinetic energy and potential 

energy work against each other in the action. In the second step, since the Euler-Lagrange equation of motion 

does not depend on the overall normalization of the action, we took out a factor of 19 and defined c = 7 and 
— 208 


g 
Indeed, you recognize that we give life to the string in the same way we gave life to the gravitational field in 
the preceding appendix. 
Varying @ and integrating by parts (once in time and once in space), we obtain the equation of motion 


a 64 & 1 
(sala) O38 re 




One important point here is that $c_s$ has dimensions of speed and thus, as you might expect, controls the speed
with which vibrations on the string propagate.

The physics we want to emphasize here could be made more clear by turning off gravity, that is, by setting
k to 0. Then the equation of motion for the string has the general solution $\phi(t, x) = f_L(c_s t + x) + f_R(c_s t - x)$,
where $f_L$ and $f_R$ are two arbitrary smooth functions describing a wave propagating to the left and to the right,
respectively (a fact you can see by sketching $\phi(t, x)$ as a function of x at different values of t). Our subsequent
discussion is a bit cleaner if we use units so that the constant $c_s$ is effectively equal to 1. (This is completely
analogous to the astronomical practice of using light years as a measure of distance.)

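The general solution $\phi(t, x) = f_L(t + x) + f_R(t - x)$ is practically begging us to use the coordinates

$$x^\pm = t \pm x \tag{22}$$

instead of (t, x). We then have $\partial_\pm \equiv \frac{\partial}{\partial x^\pm} = \frac{1}{2} (\partial_t \pm \partial_x)$. To see the last equality, simply act with both sides on $x^\pm$.
(To save writing, we use the notation $\partial_t \equiv \frac{\partial}{\partial t}$ and so forth.) We thus have the string action (up to an irrelevant
overall constant and with gravity turned off)

$$S = \int dt\, dx \left[ (\partial_t \phi)^2 - (\partial_x \phi)^2 \right] = 2 \int dx^+ dx^- \frac{\partial \phi}{\partial x^+} \frac{\partial \phi}{\partial x^-}$$

and the equation of motion $(\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2}) \phi = 0$, or even simpler with the $x^\pm$ coordinates, $\frac{\partial^2 \phi}{\partial x^+ \partial x^-} = 0$. By the way,
for reasons that will become clear later, the coordinates $x^\pm$ are known as light cone coordinates.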
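That a sum of a left-mover and a right-mover solves the string equation of motion, in units where the wave speed is 1, can be checked with finite differences. The sketch below is an illustration only: the two profile functions, the step size, and the function names are arbitrary choices, not anything fixed by the text. It evaluates the wave-equation residual $\partial_t^2 \phi - \partial_x^2 \phi$ and the mixed derivative with respect to $x^+ = t + x$ and $x^- = t - x$.

```python
import math


def f_left(u):
    """An arbitrary smooth profile for the left-moving wave (illustrative choice)."""
    return math.sin(u)


def f_right(u):
    """An arbitrary smooth profile for the right-moving wave (illustrative choice)."""
    return math.exp(-u * u)


def phi(t, x):
    """General solution f_L(t + x) + f_R(t - x), in units with wave speed 1."""
    return f_left(t + x) + f_right(t - x)


def wave_residual(field, t, x, step=1e-3):
    """Central-difference estimate of d^2 field/dt^2 - d^2 field/dx^2 at (t, x)."""
    d2t = (field(t + step, x) - 2.0 * field(t, x) + field(t - step, x)) / step ** 2
    d2x = (field(t, x + step) - 2.0 * field(t, x) + field(t, x - step)) / step ** 2
    return d2t - d2x


def mixed_lightcone_derivative(field, t, x, step=1e-3):
    """Central-difference estimate of d^2 field / (dx+ dx-), using t = (x+ + x-)/2, x = (x+ - x-)/2."""
    xp, xm = t + x, t - x

    def g(a, b):
        return field(0.5 * (a + b), 0.5 * (a - b))

    return (g(xp + step, xm + step) - g(xp + step, xm - step)
            - g(xp - step, xm + step) + g(xp - step, xm - step)) / (4.0 * step ** 2)


if __name__ == "__main__":
    print(wave_residual(phi, 0.3, -0.8), mixed_lightcone_derivative(phi, 0.3, -0.8))
```

For comparison, a field that is not of this form, such as t² x², should give residuals of order 1 rather than something close to zero.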

As I already mentioned, a dynamical variable such as $\phi(t, x)$ that depends on both time and space is known
as a field, just as an electric field or a magnetic field depends on time and space. Much more on fields later.
Here let’s recall the bad notation alert in chapter I.1. In this example, in particular, you see clearly that x is not a
dynamical variable but a label telling which segment of the string we are talking about.


Appendix 3: Baby string theory and a sneak peek at the Lorentz 
transformation 


Recall that when we discussed the energy functional of the membrane in chapter II.1, we used an observation 
from chapter I.4 that the expression $\left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2$ transforms like a scalar (see exercise I.4.1) under the rotation 
$x' = \cos\theta\, x + \sin\theta\, y$, $y' = -\sin\theta\, x + \cos\theta\, y$. 

The expression $(\partial_t\phi)^2 - (\partial_x\phi)^2$ we encountered here in the action for the string looks similar except for a 
crucial minus sign. We naturally wonder if there might be a transformation similar to rotation that would leave 
this expression invariant. 

A clue is provided by the fact that to show $x'^2 + y'^2 = x^2 + y^2$, we need the trigonometric identity $\cos^2\theta + 
\sin^2\theta = 1$. Let me remind you that the hyperbolic functions, $\cosh\varphi$, $\sinh\varphi$, and $\tanh\varphi$, are defined in analogy 
to the trigonometric functions (and are in some respects even simpler): 


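$$\cosh\varphi = \frac{1}{2}\left(e^{\varphi} + e^{-\varphi}\right), \qquad \sinh\varphi = \frac{1}{2}\left(e^{\varphi} - e^{-\varphi}\right), \qquad \tanh\varphi = \frac{\sinh\varphi}{\cosh\varphi} \tag{23}$$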


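from which the hyperbolic identity $\cosh^2\varphi - \sinh^2\varphi = 1$ follows. 
Let’s try transforming $t$ and $x$ using hyperbolic cosine and sine instead of trigonometric cosine and sine. We 
see immediately that with 

$$t' = \cosh\varphi\, t + \sinh\varphi\, x \quad\text{and}\quad x' = \sinh\varphi\, t + \cosh\varphi\, x \tag{24}$$

we have $t'^2 - x'^2 = t^2 - x^2$. Furthermore, $\partial_t\phi = \frac{\partial t'}{\partial t}\partial_{t'}\phi + \frac{\partial x'}{\partial t}\partial_{x'}\phi = \cosh\varphi\,\partial_{t'}\phi + \sinh\varphi\,\partial_{x'}\phi$. Similarly, $\partial_x\phi = 
\sinh\varphi\,\partial_{t'}\phi + \cosh\varphi\,\partial_{x'}\phi$. We then verify that, indeed, $(\partial_t\phi)^2 - (\partial_x\phi)^2 = (\partial_{t'}\phi)^2 - (\partial_{x'}\phi)^2$. The string action is 
left invariant by the transformation (24). Everything parallels the corresponding discussion for rotation of the 
action for the membrane. 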

We will see in chapter III.3 that although the string discussed here obeys Newtonian physics and has nothing 
to do with special relativity, the same transformation, known as the Lorentz transformation, appears in Einstein’s 
theory of relativity. It is amusing that the transformation is foreshadowed by baby string theory. 

To show the power of the $x^\pm$ coordinates, we note that the action $S = 2\int dx^+ dx^-\,\frac{\partial\phi}{\partial x^+}\frac{\partial\phi}{\partial x^-}$ is obviously left unchanged if we multiply 
$x^+$ and divide $x^-$ by the same factor, call it $e^{\varphi}$. But the transformation $x'^+ = e^{\varphi}x^+$, $x'^- = e^{-\varphi}x^-$ when written 
in terms of $t$ and $x$ immediately translates into the transformation above. (For example, $t' = \frac{1}{2}(x'^+ + x'^-) = 
\frac{1}{2}(e^{\varphi}x^+ + e^{-\varphi}x^-) = \cosh\varphi\, t + \sinh\varphi\, x$.) With the right choice of coordinates, we don’t even need an inspired 
guess.* 
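
For readers who like to check such claims numerically, here is a minimal Python sketch. The function names and the sample values of t, x, and the boost parameter are illustrative assumptions, not anything from the text. It applies the transformation (24) directly, applies it again by rescaling the light cone coordinates, and compares the results.

```python
import math

def boost(t, x, phi):
    """Transformation (24): a hyperbolic 'rotation' of (t, x) by the parameter phi."""
    return (math.cosh(phi) * t + math.sinh(phi) * x,
            math.sinh(phi) * t + math.cosh(phi) * x)

def boost_via_light_cone(t, x, phi):
    """Same map, obtained by multiplying x+ = t + x and dividing x- = t - x by e^phi."""
    x_plus, x_minus = t + x, t - x
    x_plus_new = math.exp(phi) * x_plus
    x_minus_new = math.exp(-phi) * x_minus
    return 0.5 * (x_plus_new + x_minus_new), 0.5 * (x_plus_new - x_minus_new)

t, x, phi = 1.3, -0.4, 0.7          # arbitrary sample values (assumed)
t1, x1 = boost(t, x, phi)
t2, x2 = boost_via_light_cone(t, x, phi)

interval_before = t ** 2 - x ** 2
interval_after = t1 ** 2 - x1 ** 2
print(interval_before, interval_after)   # the two numbers should agree
print((t1, x1), (t2, x2))                # the two routes should agree
```

The second function never mentions hyperbolic functions, which is the point of the remark about not needing an inspired guess.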


Appendix 4: Particle on a sphere 


Here is a fun problem. Determine the motion of a particle on a sphere with unit radius. Some features of the 
calculation we do here will come in handy when we get to anti de Sitter spacetime in part IX. Denote the position 
of the particle by $\vec{X} = (X^1, X^2, X^3)$, satisfying the constraint $\vec{X}^2 = 1$. (Henceforth, we suppress the arrow to 
minimize clutter.) 

The problem is best solved with a double dose of Lagrange, using the Lagrangian and the Lagrange multiplier⁵ 
to implement the constraint. So, write $L = \frac{1}{2}\dot{X}^2 + \frac{1}{2}\lambda(X^2 - 1)$, with $\lambda$ the Lagrange multiplier. (We recognize 
this as a special case of (11). We have also chosen units in which the mass of the particle is unity.) Here $L$ is to 
be regarded as dependent on $X$ and $\lambda$. 

Plug this into the Euler-Lagrange equation (7). For $q$ taken to be $\lambda$, we recover the constraint $X^2 = 1$, while for 
$q$ taken to be $X$, we obtain $\ddot{X}^i = \lambda X^i$ (with $i = 1, 2, 3$). Differentiating the constraint, we have $X \cdot \dot{X} = 0$ (where 
we indicate the dot in the dot product to remind us that we are dealing with vectors). 

Now notice by direct differentiation that the quantity $J^{ij} \equiv X^i\dot{X}^j - X^j\dot{X}^i$ does not depend on time: $\dot{J}^{ij} = 0$. 
In other words, it is conserved. Do you recognize it? Angular momentum of course! (We will have a lot more 
to say about it in the next chapter.) Define $2J^2 \equiv J^{ij}J^{ij} = 2(X^i\dot{X}^j - X^j\dot{X}^i)X^i\dot{X}^j = 2\dot{X}^2$. Hence $\dot{X}^2 = J^2$ is also 
constant. Recognize it? Total energy. 

This last equation has the solution $X^i = a^i e^{iJt} + b^i e^{-iJt}$, with $a$ and $b$ two constant complex vectors satisfying 
$a^2 = b^2 = 0$, $2a \cdot b = 1$. Note that for $X^i$ to be real, $b = a^*$. (Of course, if you prefer, you can express the solution in 
terms of sine and cosine.) The constraint $1 = X^2$ is also satisfied. Hence, the motion of the particle is completely 
solved. 

Geometrically, it is clear that the particle travels along great circles. Let’s verify this. What is the distance $D$ 
traversed by the particle between times $t_1$ and $t_2$? We have 

$$D = \int\sqrt{dX^2} = \int dt\,\sqrt{\dot{X}^2} = J\int_{t_1}^{t_2} dt = J(t_2 - t_1)$$

In contrast, at time $t_1$, $X_1 = a e^{iJt_1} + b e^{-iJt_1}$. The position $X_2$ at time $t_2$ is given by a similar expression with $t_2$ 
replacing $t_1$. The angle $\theta_{12}$ between the two vectors $X_1$ and $X_2$ is then given by $\cos\theta_{12} = X_1 \cdot X_2 = 2a \cdot b\,\cos J(t_1 - 
t_2) = \cos D$, since $2a \cdot b = 1$. We obtain, as expected, $D = \theta_{12}$. 

The attentive reader might have noticed that we need not determine the Lagrange multiplier, but it is easy to 
do. Using $X^2 = 1$ and $X \cdot \dot{X} = 0$, we find $J^{ij}\dot{X}^j = J^2 X^i$ and $J^{ij}X^j = -\dot{X}^i$. But if we differentiate the last equation, 
we obtain $-\ddot{X}^i = J^{ij}\dot{X}^j = J^2 X^i$. Comparing with the equation of motion obtained earlier, we find $\lambda = -J^2$. 
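
The solution above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the value of J, the two orthonormal real vectors e1 and e2 used to build the complex vector a, and the two sample times are all arbitrary choices made for illustration.

```python
import cmath
import math

J = 1.7                              # magnitude of angular momentum (assumed value)
e1 = (1.0, 0.0, 0.0)                 # two orthonormal real vectors (assumed choice)
e2 = (0.0, 0.6, 0.8)
a = [(p - 1j * q) / 2 for p, q in zip(e1, e2)]   # then a.a = 0 and 2 a.b = 1
b = [z.conjugate() for z in a]                   # b = a* keeps X real

def dot(p, q):
    """Plain dot product, with no complex conjugation."""
    return sum(pi * qi for pi, qi in zip(p, q))

def position(t):
    """X(t) = a exp(iJt) + b exp(-iJt); the imaginary parts cancel."""
    X = [ai * cmath.exp(1j * J * t) + bi * cmath.exp(-1j * J * t)
         for ai, bi in zip(a, b)]
    assert all(abs(z.imag) < 1e-12 for z in X)
    return [z.real for z in X]

t1, t2 = 0.2, 1.1                    # sample times with J (t2 - t1) < pi
X1, X2 = position(t1), position(t2)
D = J * (t2 - t1)                    # distance traversed
theta12 = math.acos(dot(X1, X2))     # angle between the two positions
print(dot(X1, X1), dot(X2, X2))      # both should be 1: the particle stays on the sphere
print(D, theta12)                    # should agree: motion along a great circle
```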


Exercises 


1 Show that for a harmonic oscillator, the actual path can be either the maximum or the minimum of the 
action. 


2 Suppose the falling particle in (1) follows an inverted parabola $q_2(t) = h(t - T)^2/T^2$. By the argument given 
in the text, the corresponding action must be higher than the actual action. Calculate $S(q_2)$ and show that it 
is indeed higher. 


3 Show that for the action $S(q) = \int_0^T dt\,\left\{\frac{1}{2}\left(\frac{dq}{dt}\right)^2 - gq\right\}$ discussed in the text, with initial and final conditions 
$q(0) = q(T) = 0$, the action evaluates to $-\frac{1}{24}g^2T^3$ for the actual path $q(t) = -\frac{1}{2}gt(t - T)$. If the particle 
stayed on the couch, its action would have been 0. 


* If $g = 0$, the physics is invariant under a much larger class of transformations, as you will see in exercise 5 
and later in chapter IX.9. 


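4 Vary $S = 2\int dx^+ dx^-\,\frac{\partial\phi}{\partial x^+}\frac{\partial\phi}{\partial x^-}$ to obtain the equation of motion $\frac{\partial^2\phi}{\partial x^+\partial x^-} = 0$. 
5 Show that for $g = 0$, the dynamics of the string is invariant under $x'^+ = f_+(x^+)$, $x'^- = f_-(x^-)$, with $f_+$ and 
$f_-$ two arbitrary smooth functions. 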
6 Solve the isoperimetrical problem. You know the solution is a circle. 
Notes 
1. If it ever comes to a priority dispute, Fermat would have to cede to Heron of Alexandria (circa 65 AD). 
2. K. Barner, “How Old Did Fermat Become?” NTM 9 (2001), p. 209. 
3. R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals; also, QFT Nut, chapter 1.2. 
4. QFT Nut. 
5. Let me remind those readers who are a bit shaky about the Lagrange multiplier that I gave a brief review in 


endnote 8 in chapter I.7. 


II.4 Symmetry and Conservation 


“Spiritual formulas” 


Pure mathematics is, in its way, the poetry of logical ideas. 
One seeks the most general ideas of operation which will 
bring together in simple, logical and unified form the largest 
possible circle of formal relationships. In this effort toward logical 
beauty spiritual formulas are discovered necessary for the deeper 
penetration into the laws of nature. 


—Albert Einstein, writing about Amalie Emmy Noether 
(1882-1935) 


Symmetry¹ and conservation played, and continue to play, intertwined and central roles in 
physics. A set of transformations that leaves physics unchanged is said to be a symmetry. 
The example of angular momentum conservation, as discussed in chapter I.2, strongly 
indicates that symmetry and conservation are intimately related. In this chapter, we will 
have a general discussion showing that this is indeed the case. 

I give immediately two concrete examples. 


Example A 


A particle moves in 2-dimensional space under a rotationally invariant potential, that is, a 
potential without a preferred direction: 


$$L = \frac{1}{2}m\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right] - V(r) \tag{1}$$


where, as usual, $r = \sqrt{x^2 + y^2}$. 


Example B 


Two particles move in 1-dimensional space interacting with a translation invariant poten- 
tial, that is, a potential that depends only on the distance between them: 


$$L = \frac{1}{2}m\left[\left(\frac{dq_1}{dt}\right)^2 + \left(\frac{dq_2}{dt}\right)^2\right] - V(|q_1 - q_2|) \tag{2}$$


Let us consider a Lagrangian $L\left(\frac{dq_a}{dt}, q_a\right)$ depending on a certain number of $q$s, which 
we denote as $q_a$. You see that our notation is flexible enough so that the index $a$ can have 
entirely different meanings in different examples. In example A, it labels the $x$ and $y$ 
coordinates of a single particle. In example B, it labels two different particles. In general, we 
can have $n$ particles moving in $D$-dimensional space, so that $a = (\alpha, i)$, with $\alpha = 1, \cdots, n$ 
and $i = 1, \cdots, D$. In other words, the index $a$ labels the different degrees of freedom, and 
$L\left(\frac{dq_a}{dt}, q_a\right)$ denotes $L\left(\frac{dq_1}{dt}, \cdots, \frac{dq_N}{dt}, q_1, \cdots, q_N\right)$ with $N = nD$ degrees of freedom. 


With this rather general setup, we proceed. We say that our Lagrangian exhibits a symmetry 
if it remains invariant under an infinitesimal transformation $q_a \to q_a + \delta q_a$. It 
suffices to specify an infinitesimal transformation, since a noninfinitesimal transformation 
can be built up by repeating infinitesimal transformations, as explained in chapter I.3. 

Again, let us hasten to our concrete examples. The Lagrangian in example A does not 
change under the transformation $x \to x + \epsilon y$ and $y \to y - \epsilon x$. Recall from chapter I.3 that 
this is just a rotation through an infinitesimal angle $\epsilon$. The Lagrangian in example B does 
not change under the transformation $q_1 \to q_1 + \epsilon$ and $q_2 \to q_2 + \epsilon$. 


Profundity and simplicity 


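We are now ready to prove Noether’s theorem. As is often the case with the most profound 
theorems in theoretical physics, the proof is astonishingly simple. The statement that a 
Lagrangian does not change under an infinitesimal transformation $q_a(t) \to q_a(t) + \delta q_a(t)$ 
can be written as 

$$0 = \delta L = \sum_a\left(\frac{\delta L}{\delta\frac{dq_a}{dt}}\,\delta\!\left(\frac{dq_a}{dt}\right) + \frac{\delta L}{\delta q_a}\,\delta q_a\right) \tag{3}$$

Under the transformation $q_a \to q_a + \delta q_a$, we have, by differentiating, $\frac{dq_a}{dt} \to \frac{d}{dt}(q_a + 
\delta q_a) = \frac{dq_a}{dt} + \frac{d\,\delta q_a}{dt}$, so that $\delta\!\left(\frac{dq_a}{dt}\right) = \frac{d\,\delta q_a}{dt}$. Thus, (3) becomes 

$$0 = \sum_a\left(\frac{\delta L}{\delta\frac{dq_a}{dt}}\,\frac{d\,\delta q_a}{dt} + \frac{\delta L}{\delta q_a}\,\delta q_a\right) \tag{4}$$

To keep things uncluttered so that you can see more clearly what is going on, I ask you 
to hold the summation $\sum_a$ in your head. I now suppress the index $a$ and write (4) as 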


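$$0 = \frac{\delta L}{\delta\frac{dq}{dt}}\,\frac{d\,\delta q}{dt} + \frac{\delta L}{\delta q}\,\delta q \tag{5}$$

What do we do now? The only weapon at our disposal is the Euler-Lagrange equation 

$$\frac{\delta L}{\delta q} = \frac{d}{dt}\left(\frac{\delta L}{\delta\frac{dq}{dt}}\right) \tag{6}$$

Putting this into (3), we have 

$$0 = \frac{\delta L}{\delta\frac{dq}{dt}}\,\frac{d\,\delta q}{dt} + \left[\frac{d}{dt}\left(\frac{\delta L}{\delta\frac{dq}{dt}}\right)\right]\delta q \tag{7}$$

Lo and behold! The two terms on the right hand side combine, and we obtain 

$$0 = \frac{d}{dt}\left(\frac{\delta L}{\delta\frac{dq}{dt}}\,\delta q\right) \tag{8}$$

In other words, the motion is such that 

$$Q \equiv \frac{\delta L}{\delta\frac{dq}{dt}}\,\delta q \tag{9}$$

does not change in time. $Q$ is conserved! (Restoring the sum in your head, we have 
$Q = \sum_a \frac{\delta L}{\delta\frac{dq_a}{dt}}\,\delta q_a$.) 

This is Noether’s theorem, proved in that momentous year for physics 1915: for every 
transformation that leaves the Lagrangian unchanged, there is a conserved quantity. 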


Applying Noether’s theorem 


I am sure that you are eager to see how this actually works in practice. So let us jump to 
our two examples on the double. We just have to plug in (9). 


Example A 

$$Q = m\left(\frac{dx}{dt}\,\delta x + \frac{dy}{dt}\,\delta y\right) = \epsilon m\left(y\frac{dx}{dt} - x\frac{dy}{dt}\right) \tag{10}$$

Don’t forget the sum $\sum_a$ that you were holding in your head! That is why there are two 
terms: the sum runs over $x$ and $y$. 

A trivial remark is that we can drop the overall constant $\epsilon$ now that its job is done. 
The more important remark is that the conserved quantity $m(y\dot{x} - x\dot{y})$ is just the angular 
momentum. Thus, the conservation of angular momentum follows from rotational 
symmetry, as we suspected in chapter I.2. 


Example B 


$$Q = \epsilon m\left(\dot{q}_1 + \dot{q}_2\right) = \epsilon m\left(\frac{dq_1}{dt} + \frac{dq_2}{dt}\right) \tag{11}$$

We recognize this as just the conservation of total momentum! The conservation of 
momentum follows from translational symmetry. 
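
Both conservation laws are easy to watch in a numerical experiment. The sketch below is only an illustration under assumptions of my choosing: the potentials V(r) = r⁴/4 and V = (q₁ − q₂)⁴/4, the initial conditions, the step size, and the velocity Verlet integrator are not taken from the text. Any rotationally invariant or translation invariant potential would serve equally well.

```python
def angular_momentum_history(steps=2000, dt=1e-3, m=1.0):
    """Example A with the assumed potential V(r) = r**4 / 4, by velocity Verlet."""
    x, y, vx, vy = 1.0, 0.0, 0.3, 0.8

    def force(x, y):
        r2 = x * x + y * y
        return -r2 * x, -r2 * y            # minus the gradient of r**4 / 4

    fx, fy = force(x, y)
    history = [m * (y * vx - x * vy)]      # Q / epsilon from (10)
    for _ in range(steps):
        vx += 0.5 * dt * fx / m
        vy += 0.5 * dt * fy / m
        x += dt * vx
        y += dt * vy
        fx, fy = force(x, y)
        vx += 0.5 * dt * fx / m
        vy += 0.5 * dt * fy / m
        history.append(m * (y * vx - x * vy))
    return history

def total_momentum_history(steps=2000, dt=1e-3, m=1.0):
    """Example B with the assumed potential V = (q1 - q2)**4 / 4."""
    q1, q2, v1, v2 = 0.0, 1.0, 0.5, -0.2
    f = -(q1 - q2) ** 3                    # force on particle 1; particle 2 feels -f
    history = [m * (v1 + v2)]              # Q / epsilon from (11)
    for _ in range(steps):
        v1 += 0.5 * dt * f / m
        v2 -= 0.5 * dt * f / m
        q1 += dt * v1
        q2 += dt * v2
        f = -(q1 - q2) ** 3
        v1 += 0.5 * dt * f / m
        v2 -= 0.5 * dt * f / m
        history.append(m * (v1 + v2))
    return history

L_values = angular_momentum_history()
P_values = total_momentum_history()
print(max(L_values) - min(L_values))       # drift in angular momentum, round-off only
print(max(P_values) - min(P_values))       # drift in total momentum, round-off only
```

The positions and velocities change substantially during the run, while the two Noether charges stay fixed up to round-off.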


Utter generality 


With practice, you will soon get the hang of it. You are given some Lagrangian, and after 
staring at it, you realize that it is invariant under some transformation. Then you simply 
plug the variation of $q_a$, namely $\delta q_a$, into Noether’s formula. There you are! The conserved 
quantity is $\sum_a \frac{\delta L}{\delta\frac{dq_a}{dt}}\,\delta q_a$. Here I have restored the sum over $a$. 

Note that the proof does not refer to the specific form and content of the Lagrangian, 
except for the fact that the Euler-Lagrange equation holds. 

Noether’s formula is utterly general and holds, all the way up, for quantum field 
theory, for string theory, and in fact, for any theory that respects the action principle. 
For the special case of Newtonian mechanics, the Lagrangian has the generic form $L = 
\sum_a \frac{1}{2}m_a\left(\frac{dq_a}{dt}\right)^2 - V(q_1, q_2, \cdots)$, in which case the conserved quantity corresponding to 
translation invariance is given simply by $\sum_a m_a\frac{dq_a}{dt}\,\delta q_a$. Note that I have allowed for the 
possibility of the particles having different masses. Thus, for example, if the two particles 
in example B have different masses, the conserved quantity is $m_1\frac{dq_1}{dt} + m_2\frac{dq_2}{dt}$. 


Energy conservation 


Staring at (3), the astute reader might have noticed that even if $\delta L$ does not vanish, but 
as long as it is a time derivative (namely, that $\delta L = \frac{dK}{dt}$ for some $K$), we can still proceed. 
When we get to (8), we now have $\frac{dK}{dt}$ on the left hand side instead of 0: 

$$\frac{dK}{dt} = \frac{d}{dt}\left(\frac{\delta L}{\delta\frac{dq}{dt}}\,\delta q\right) \tag{12}$$

We still have a conserved quantity: 

$$Q = \frac{\delta L}{\delta\frac{dq}{dt}}\,\delta q - K \tag{13}$$

You might think that this is an obscure clause, a technicality fit for some nattering math- 
ematical nitpickers. What kind of transformation would change L into this special form? 
But you would be wrong. Energy conservation provides an example of this phenomenon. 

Consider a Lagrangian without any explicit time dependence, such as the Lagrangian 
in our examples A and B. Let the transformation be an infinitesimal translation in time: 
$q(t) \to q(t + \epsilon) \simeq q(t) + \epsilon\frac{dq}{dt}$. We are just shifting the argument of $q(t)$ and $\frac{dq}{dt}(t)$ inside 
$L$ by $\epsilon$ and thus, obviously, $\delta L = \epsilon\frac{dL}{dt}$; in other words, we have $K = \epsilon L$. Plugging $\delta q = \epsilon\frac{dq}{dt}$ 
into (13) for the simplest case of a particle moving in a 1-dimensional potential, we find 

$$Q = \epsilon\left(m\frac{dq}{dt}\frac{dq}{dt} - L\right) = \epsilon\left(m\left(\frac{dq}{dt}\right)^2 - \left(\frac{1}{2}m\left(\frac{dq}{dt}\right)^2 - V(q)\right)\right) = \epsilon\left(\frac{1}{2}m\left(\frac{dq}{dt}\right)^2 + V(q)\right) \tag{14}$$

which is, up to the no-longer-relevant overall constant $\epsilon$, just the total energy. Note that in 
contrast to chapter I.2, here we simply turn the crank. 

To understand better what is going on, it is instructive to look at this derivation from a 
somewhat different point of view. Start with the action 


$$S = \int_0^T dt\,\left\{\frac{1}{2}m\dot{q}^2 - V(q)\right\} \tag{15}$$

Now consider 

$$S' = \int_0^T dt\,\left\{\frac{1}{2}m\dot{q}(t + \epsilon)^2 - V(q(t + \epsilon))\right\} \tag{16}$$

It is crucial to realize that we are holding the limits of integration fixed, so this is not just a 
trivial shift of the dummy integration variable by $\epsilon$. Thus, $S' \neq S$. Expanding the difference 
to linear order in $\epsilon$, we have 

$$\delta S \equiv S' - S = \epsilon\int_0^T dt\,\left(m\dot{q}\ddot{q} - V'(q)\dot{q}\right) \tag{17}$$

We now evaluate $\delta S$ in two different ways. First, using the equation of motion, we can 
write (17) as 

$$\delta S = \epsilon\int_0^T dt\,\left(m\ddot{q} - V'(q)\right)\dot{q} = \epsilon\int_0^T dt\,\left(-2V'(q)\right)\dot{q} = -2\epsilon\int_0^T dt\,\frac{dV}{dt}$$

Second, without using the equation of motion, we write (17) as 

$$\delta S = S' - S = \epsilon\int_0^T dt\,\frac{d}{dt}\left(\frac{1}{2}m\dot{q}^2 - V(q)\right)$$

Equating these two expressions for $\delta S$, we obtain $\int_0^T dt\,\frac{d}{dt}\left(\frac{1}{2}m\dot{q}^2 + V(q)\right) = 0$. (Note how 
the relative sign between the kinetic and potential energy terms flips at this point.) Doing 
the trivial integral and defining $E(t) = \frac{1}{2}m\dot{q}^2 + V(q)$, we obtain $E(T) = E(0)$. 
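
The two evaluations of δS can be compared numerically on a concrete case. The sketch below assumes a harmonic oscillator with m = 1 and V(q) = q²/2, whose actual path q(t) = cos t is known in closed form. The values of T and ε and the Simpson integrator are likewise assumptions made only for illustration.

```python
import math

def lagrangian(t):
    """L evaluated on the actual path q(t) = cos t, for m = 1 and V(q) = q**2 / 2."""
    q, q_dot = math.cos(t), -math.sin(t)
    return 0.5 * q_dot ** 2 - 0.5 * q ** 2

def potential(t):
    """V(q(t)) on the same path."""
    return 0.5 * math.cos(t) ** 2

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

T, eps = 2.0, 1e-5                    # assumed sample values
S = simpson(lagrangian, 0.0, T)                                # action (15)
S_shifted = simpson(lambda t: lagrangian(t + eps), 0.0, T)     # action (16)
delta_S_over_eps = (S_shifted - S) / eps

first_way = -2.0 * (potential(T) - potential(0.0))   # uses the equation of motion
second_way = lagrangian(T) - lagrangian(0.0)         # total derivative of L
print(delta_S_over_eps, first_way, second_way)       # all three should agree to O(eps)
```

That the last two numbers coincide is equivalent to E(T) = E(0) for this path.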

Perhaps Noether’s infinitesimal transformations here reminded you of Lie’s infinitesi- 
mal transformations in chapter I.3. Clearly, there is a fruitful connection to be exploited. 
Just a remark to whet your appetite for more. 


Exercise 


1 Derive the conservation of angular momentum in 3 dimensions using Noether’s theorem. 


Note 


1. Fearful. 


Recap to Part II 


We find the shortest path by the incredibly clever trick of comparing the length of different 
paths, basically the same method used by a colony of ants. The ants send out zillions to 
try out different paths to the honey. We adopt the same idea. They exploit pheromone 
evaporation; we use the variational calculus. Euler and Lagrange proposed to change things 
a little and see what happens. 

Mysteriously, all of fundamental physics is governed by the action, from which the 
equations of motion follow. 

A profound truth is that the conservation laws are due to symmetries. 


Part III | Space and Time Unified 


III.1 Galileo versus Maxwell 


I am convinced that the philosophers have had a harmful effect 
upon the progress of scientific thinking in removing certain 
fundamental concepts from the domain of empiricism, where 
they are under our control, to the intangible heights of the a 
priori... . This is particularly true of our concepts of time and 
space, which physicists have been obliged by the facts to bring 
down from the Olympus of the a priori in order to adjust them 
and put them in a serviceable condition. 


—A. Einstein¹ 


Galilean transformation 


Go back to the prelude, in which Galileo's ship was updated to Einstein’s train. The observer 
on the train, Ms. Unprime, ascribed to some event the spatial coordinates (x, y, z) and 
temporal coordinate t. To the same event, the observer on the ground, Mr. Prime, assigns 
the coordinates (x’, y’, z’) and t’. Denote the speed of the train by u, and choose the axis 
so that the train moves along the x-axis. Then the two sets of coordinates are related by 


$$\begin{aligned} t' &= t \\ x' &= x + ut \\ y' &= y \\ z' &= z \end{aligned} \tag{1}$$


a set of relations known as the Galilean transformation. Consider a point on the train with 
x = 0. Plugging this into (1), we see that, for Mr. P on the ground, this point moves along 
according to x’ = ut =ut’. 

The innocuous looking equalities y’ = y and z’ = z actually represent an important 
consequence of Galileo’s relativity principle. Call the y direction the vertical direction. We 
can supply sticks of a standard length L to Ms. U and Mr. P to build a fence. 

To make sure that the sticks supplied to the two observers are identical, we can arrange 
for the woodcutter to ride in a train going by at speed $\frac{1}{2}u$ relative to Mr. P and $-\frac{1}{2}u$ 
relative to Ms. U. In other words, the coordinates of the woodcutter are given by $t' = t_w$ 
and $x' = x_w + \frac{1}{2}ut_w$. As far as the woodcutter is concerned, he is at rest, and Mr. P and 
Ms. U are going by him at the same speed but in opposite directions.² The woodcutter 
can toss the pre-cut sticks in identical ways to the two observers and their helpers. This 
long-winded digression is to answer any objection that the tossing of sticks from Mr. P to 
Ms. U, say, could have done something to the lengths of the sticks. 

The top of the two fences is then given by y = L and y’ = L, respectively. The two lengths 
must agree, because as the two fences sweep past each other, the two observers could see 
whether one fence is taller than the other. In either case, Galileo’s relativity principle, 
stating that two observers in relative uniform motion could not decide who is moving 
relative to the other, would be violated. Thus, we must have y’ = y. Similarly, z’ = z. The 
coordinates perpendicular to the direction of motion are unaffected by the motion. 

The relation x’ = x + ut certainly does not violate Galileo’s principle, since x = x’ + 
(—u)t'. To Ms. U, she is at rest, but relative to her, Mr. P, sitting at x’ = 0, is moving with 
speed —u in the x direction. 

We have set up the coordinates so that when $t' = t = 0$, we have $x' = x = 0$. Just as 
in chapter I.3, we can avoid having to line up the origins of the two coordinate systems 
by considering the separation between two events $E_1$ and $E_2$ in spacetime located at 
$(t_1, x_1, y_1, z_1)$, $(t_1', x_1', y_1', z_1')$, $(t_2, x_2, y_2, z_2)$, and $(t_2', x_2', y_2', z_2')$. Writing $\Delta t = t_2 - t_1$, $\Delta x = 
x_2 - x_1$, and so on, and $\Delta t' = t_2' - t_1'$, $\Delta x' = x_2' - x_1'$, and so forth, we have 

$$\begin{aligned} \Delta t' &= \Delta t \\ \Delta x' &= \Delta x + u\,\Delta t \\ \Delta y' &= \Delta y \\ \Delta z' &= \Delta z \end{aligned} \tag{2}$$

Since the $y$ and $z$ coordinates are just going along for the ride, we omit writing the 
transformation equations for them henceforth. Again, just as for rotations in chapter I.3, 
we can replace the finite differences $\Delta t$, $\Delta x$, and so on by infinitesimals $dt$, $dx$, and so 
forth: 

$$\begin{aligned} dt' &= dt \\ dx' &= dx + u\,dt \end{aligned} \tag{3}$$


Adding velocities 


The addition of velocities is so physically intuitive that almost everybody grasps it in 
everyday life. You are in a car speeding down the highway at 70 miles an hour. A fly trapped 
in the car flies forward at 3 miles an hour. To a hitchhiker standing by the roadside, the fly 
evidently moves forward at 70 + 3 = 73 miles an hour, even though flies normally can’t fly 
that fast. Indeed, if the hitchhiker also sees the fly moving forward at 3 miles an hour, it 
would have smashed into the rear window in an instant. 

To formalize this intuitively obvious understanding, let us go back to the train. Ms. U 
tosses an object forward with velocity v; in other words, the object’s trajectory is described 
by $x = vt$. (See figure 1, showing Ms. U as a stoker on Einstein's train.) 

Figure 1 A lump of coal is tossed forward on a moving train. (Illustration adapted from Fearful.) 


Simply plug this into (1) and we obtain, very slowly and carefully, the velocity seen by 
Mr. P: 


$$v' = \frac{x'}{t'} = \frac{x + ut}{t} = \frac{x}{t} + u = v + u \tag{4}$$


We just add the velocity of the object to the velocity of the train, as everybody would have 
felt intuitively. We can obtain the same result, perhaps a tad quicker, by going to (3) and 
dividing dx’ by dt’ to obtain 


$$v' = \frac{dx'}{dt'} = \frac{dx + u\,dt}{dt} = \frac{dx}{dt} + u \tag{5}$$

The calculus book I read in high school warns the reader sternly that $\frac{dx}{dt}$ is a holistic 
(but of course that word did not become fashionable until much later) symbol of a single 
mathematical entity and is not to be thought of as dx divided by dt. I am telling you that 
at the level of rigor of theoretical physics it is okay. Just think of the differential dx as 
the difference Ax, divide by At, and then take the Newton-Leibniz limit. When we get to 
general relativity, we will be constantly manipulating differentials. 

We now see that the invariance of Newtonian mechanics under the Galilean transformation 
follows merely because Newton’s law involves the second derivative, so that 

$$m\frac{d^2x'}{dt'^2} = m\frac{d}{dt}\left(\frac{dx}{dt} + u\right) = m\frac{d^2x}{dt^2} \tag{6}$$

An important point here is that this derivation even tells us when Galilean invariance 
of Newtonian mechanics fails. If $u$ changes in magnitude or in direction (we had chosen 
$u$ to point in the $x$ direction, but $\vec{u}$ is really a vector!), then (6) is changed to 

$$\vec{F}' = m\vec{a}' = m\vec{a} + m\frac{d\vec{u}}{dt} \tag{7}$$

An ancient part of our brains interprets this extra term as an apparent additional force: 
our body feels it when the driver of the car (remember, the one with the fly trapped in it) 
speeding down the highway suddenly slams on the brake or zips around a sharp curve. 
Even someone as dumb as a fly would feel the additional force $m\frac{d\vec{u}}{dt}$ as it smashes into the 
windshield. Unfortunate as well as dumb. 

But for now, what the fly knows is advanced stuff for us; we will get to it when we discuss 
gravity. Let us check that the action for Newtonian mechanics is Galilean invariant. First, 
for simplicity, look at the action for a single free particle in one dimension: 


$$S = \int dt'\,\frac{1}{2}m\left(\frac{dx'}{dt'}\right)^2 = \int dt\,\frac{1}{2}m\left(\frac{dx}{dt} + u\right)^2 = \int dt\,\frac{1}{2}m\left(\frac{dx}{dt}\right)^2 + u\int dt\,m\frac{dx}{dt} + u^2\int dt\,\frac{1}{2}m \tag{8}$$

The extra term linear in $u$ in the Lagrangian is proportional to the integral of the derivative 
$\frac{dx}{dt}$. With fixed initial and final conditions, it is just an irrelevant additive constant. The 
term quadratic in $u$ is also an additive constant. In other words, the change in the action 
$S$ is just some additive constant whose variation vanishes. 
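
A quick numerical check makes the statement concrete: the change in the action under a Galilean boost should be the same for any two paths sharing the same endpoints. The following Python sketch uses assumed sample values for m, u, and T, two arbitrarily chosen paths running from x = 0 to x = 1, and a Simpson integrator, none of which come from the text.

```python
import math

m, u, T = 2.0, 0.5, 3.0              # assumed sample values

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def free_action(velocity):
    """Integral over [0, T] of (1/2) m (dx/dt)**2 for a given velocity function."""
    return simpson(lambda t: 0.5 * m * velocity(t) ** 2, 0.0, T)

def change_under_boost(velocity):
    """Action seen by Mr. Prime minus action seen by Ms. Unprime, as in (8)."""
    return free_action(lambda t: velocity(t) + u) - free_action(velocity)

# Two different paths with the same endpoints x(0) = 0 and x(T) = 1.
def straight(t):
    return 1.0 / T                                    # x = t / T

def wiggly(t):
    return 1.0 / T + (math.pi / T) * math.cos(math.pi * t / T)   # x = t/T + sin(pi t/T)

expected = m * u * (1.0 - 0.0) + 0.5 * m * u ** 2 * T   # depends on endpoints only
print(change_under_boost(straight), change_under_boost(wiggly), expected)
```

Since the shift is path independent, it drops out when the action is varied with the endpoints held fixed.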

This simple demonstration can be immediately generalized to the many-particle case 
with 


$$S = \int dt\,\left\{\sum_a \frac{1}{2}m_a\left(\frac{dx_a}{dt}\right)^2 - \sum_{a \neq b} V(x_a - x_b)\right\} \tag{9}$$

Note that it is necessary for the interaction potential to depend on the difference $x_a - x_b = 
x_a' - x_b'$. The generalization to higher dimensional space is trivial. 

Incidentally, you might have noticed that implicit in the argument is the assumption that 
the two observers in relative motion agree on the same mass. I have underlined this by 
writing m explicitly in (6) and (8). There is no m’. Galilean relativity requires that different 
observers measure the same mass. 

Contrary to what the guy in the street might think, the principle of relativity did not start 
with Einstein, but, in a sense, was reestablished by Einstein’s special relativity. 


Showdown between Galileo and Maxwell 


While the addition of velocities (4) is so intuitively obvious, even to a layperson not versed 
in physics (as in my everyday example of a speeding car), it came to play a central role 
in the looming crisis that confronted physics toward the end of the 19th century. In his 
monumental work, Maxwell finally gave a precise elucidation of the mystery of light, 
revealing it to be an undulating electromagnetic field. An electric field varying in space 
and time generates a magnetic field varying in space and time, which in turn generates an 
electric field varying in space and time, and thus the wave propagates through space and 
time. The speed of propagation c depends only on how oscillating electric and magnetic 
fields generate each other, and that, as the reader may recall or have heard, does not depend 
on the observer. 

On that occasion with the fly in the car, I was riding in the back seat, and I had a camera³ 
with me. I took a picture of a friend riding in the front seat next to the driver and the 
flash went off. Telling my friend that the speed of light is* 186,000 × 3,600 = 669,600,000 
miles per hour, I asked my friend how fast a hitchhiker standing by the roadside would 
have seen the flash of light go by. Her answer, indeed the only intuitively reasonable and 
incontrovertible answer, was 70 + 669,600,000 = 669,600,070 miles per hour. 

But this contradicts Maxwell’s equations. 

To read this book, for the most part you do not need to have completely mastered 
Maxwell's theory of electromagnetism (although it would help). I will even derive it later. 
At this point in our development, the single most important point is that light does not 
obey the law of addition of velocities (4) that everyone took to be totally obvious. For light, 
both observers measure the same speed: 


$$c' = c \tag{10}$$


As I mentioned in chapter I.1, in the showdown between these two equations, (4) and (10), 
the law of addition of velocities blinked and had to be modified. 


This great antinomy made him stuck 


Various eminent physicists in the late 19th century realized that they could reconcile 
the contradiction between Maxwell’s theory and the law of addition of velocities if they 
postulated that light, just like sound, had to propagate in a medium, an ether pervading 
the universe. The speed of light c determined by Maxwell’s theory is the speed of light as 
seen by an observer at rest with respect to the ether. As the earth moves through the ether, 
the speed of light measured on earth would vary. 

Notice that the existence of the ether would have profound implications for the foun- 
dation of physics, namely, that absolute rest could be defined as rest with respect to the 
ether. Ms. U and Mr. P could determine who is at rest and who is moving. 

As you may have heard, the experimental evidence was against the infamous ether. 
In 1887 (when Einstein was 8 years old), Michelson and Morley performed a famous 
experiment to detect the ether and failed. By the way, Einstein claimed that he was guided 
solely by Maxwell’s equations and had never heard of the experiment. 

Indeed, Einstein even contemplated his own experimental setup to look for the ether. 
In an impromptu speech given in December 1922 in Kyoto, Japan, describing how he had 
discovered special relativity, he said that he had not doubted the existence of the ether 
and that he had even thought of an experiment using two thermocouples to measure the 
difference in the heat generated by two light rays, one moving in the same direction as the 
earth, the other in the opposite direction. 

A pair of thermocouples to measure the difference in the heat generated by the two 
light rays, yeah right! You might have smiled: good old Albert was a far better theorist 
than experimentalist. Michelson and Morley had a far better idea, to interfere the two light 
rays. Putting that aside, you could sense Einstein’s frustration. In 1922, he said something 
to the effect that this “great antinomy,” between Maxwell's equations and the addition of 
velocities, had really made him stuck. 


* Since this true story took place in southern California, we use “royal” rather than “revolutionary” units here. 

Theoretical physicists love nothing better than a major contradiction between two well- 
established results, each seemingly beyond reproach. In this case, it was a shoot-out 
between electrodynamics and the addition of velocities. Einstein and his contemporaries 
were inclined to blame electromagnetic theory. The law of addition of velocities seemed 
rock solid. It took the cumulative genii of Lorentz, Fitzgerald, Poincaré, Einstein, and 
others to suspect that something was wrong with (1). 


Appendix: Galilean invariance and fluid dynamics 


Most texts pass over the Galilean transformation in a headlong rush toward special relativity. I like to mention 
that in fact, Galilean invariance offers us a powerful and often useful constraint⁴ on Newtonian physics. 

You may or may not know that much of fluid dynamics is governed by the Navier-Stokes equation (with $\nu$ 
denoting the viscosity, $\rho$ the density, and $P$ the pressure): 

$$\frac{\partial\vec{v}}{\partial t} + (\vec{v}\cdot\vec{\nabla})\vec{v} = \nu\nabla^2\vec{v} - \frac{1}{\rho}\vec{\nabla}P \tag{11}$$

The pressure gradient $\vec{\nabla}P$ provides the driving force, and the appearance of the mass density $\rho$ comes from the 
$m$ in $F = ma$. I will now give you a quick derivation using Galilean invariance. 

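Suppose that Ms. Unprime wants to study fluid dynamics but has never heard of the Navier-Stokes equation. 
She proceeds to write an equation for $\frac{\partial\vec{v}}{\partial t}$, where $\vec{v}(t, \vec{x})$ is the fluid velocity at the point $\vec{x}$ at time $t$. What are 
the possible terms in this equation? By rotation invariance, we are to construct vectors out of what is available, 
namely $\vec{v}(t, \vec{x})$ and $\vec{\nabla} = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$. 

The key to the symmetry approach presented here is to require that whatever equation Ms. Unprime writes 
down has to be the same as what Mr. Prime writes down. Mr. Prime sees the fluid moving with the velocity 
$\vec{v}'(t', \vec{x}') = \vec{v}(t, \vec{x}) + \vec{u}$. In what follows, it is sometimes convenient to take $\vec{u}$ to point in the $x$ direction, but it is 
also easy to write (1) a bit more generally: $t' = t$, $\vec{x}' = \vec{x} + \vec{u}t$. First, $\left.\frac{\partial}{\partial\vec{x}'}\right|_{t'} = \left.\frac{\partial}{\partial\vec{x}}\right|_{t}$, that is, $\vec{\nabla}' = \vec{\nabla}$. 
Next, we have $\left.\frac{\partial}{\partial t'}\right|_{\vec{x}'} = \left.\frac{\partial}{\partial t}\right|_{\vec{x}} + \left.\frac{\partial\vec{x}}{\partial t'}\right|_{\vec{x}'}\cdot\vec{\nabla} = \frac{\partial}{\partial t} - \vec{u}\cdot\vec{\nabla}$ (as usual, the symbol $|_{\vec{x}'}$ indicates that the partial derivative is 
to be taken with $\vec{x}'$ held fixed, and so in the last step, since $\vec{x} = \vec{x}' - \vec{u}t'$, we have $\left.\frac{\partial\vec{x}}{\partial t'}\right|_{\vec{x}'} = -\vec{u}$). 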


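Let us now express what Mr. Prime writes down in terms of what Ms. Unprime would write down. First, 
$\frac{\partial\vec{v}'}{\partial t'} = \frac{\partial(\vec{v} + \vec{u})}{\partial t} - (\vec{u}\cdot\vec{\nabla})(\vec{v} + \vec{u}) = \frac{\partial\vec{v}}{\partial t} - (\vec{u}\cdot\vec{\nabla})\vec{v}$, since $\vec{u}$ is constant by assumption. Next, observe that 

$$(\vec{v}'\cdot\vec{\nabla}')\vec{v}' = ((\vec{v} + \vec{u})\cdot\vec{\nabla})(\vec{v} + \vec{u}) = (\vec{v}\cdot\vec{\nabla})\vec{v} + (\vec{u}\cdot\vec{\nabla})\vec{v}$$

Thus, we learn that the differential operator 

$$\frac{\partial\vec{v}'}{\partial t'} + (\vec{v}'\cdot\vec{\nabla}')\vec{v}' = \frac{\partial\vec{v}}{\partial t} - (\vec{u}\cdot\vec{\nabla})\vec{v} + (\vec{v}\cdot\vec{\nabla})\vec{v} + (\vec{u}\cdot\vec{\nabla})\vec{v} = \frac{\partial\vec{v}}{\partial t} + (\vec{v}\cdot\vec{\nabla})\vec{v} \tag{12}$$

is invariant under Galilean transformation. Thus, Galilean invariance mandates that the combination $\frac{D\vec{v}}{Dt} \equiv 
\frac{\partial\vec{v}}{\partial t} + (\vec{v}\cdot\vec{\nabla})\vec{v}$ appears in the equation for fluid flow. We also note, more trivially, $\nabla'^2\vec{v}' = \nabla^2\vec{v}$. 
Therefore, requiring that Mr. Prime and Ms. Unprime observe the same physics, we arrive at the Navier-Stokes 
equation⁵ (11). 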
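
The invariance stated in (12) can be tested numerically in one space dimension. In the sketch below, the velocity field v(t, x), the value of u, the sample point, and the finite-difference step are all hypothetical choices made for illustration. The function estimates ∂v/∂t + v ∂v/∂x by central differences for each observer.

```python
import math

u = 0.7   # constant relative velocity of the two observers (assumed value)

def v(t, x):
    """Hypothetical velocity field seen by Ms. Unprime."""
    return math.sin(x - 0.3 * t) + 0.2 * x * t

def v_prime(t_p, x_p):
    """Same flow seen by Mr. Prime: v'(t', x') = v(t, x) + u, with t = t', x = x' - u t'."""
    return v(t_p, x_p - u * t_p) + u

def convective_derivative(field, t, x, h=1e-5):
    """Central-difference estimate of (d/dt + field d/dx) field at the point (t, x)."""
    dt_field = (field(t + h, x) - field(t - h, x)) / (2 * h)
    dx_field = (field(t, x + h) - field(t, x - h)) / (2 * h)
    return dt_field + field(t, x) * dx_field

t, x = 0.9, 0.4                                            # sample event (assumed)
unprimed = convective_derivative(v, t, x)
primed = convective_derivative(v_prime, t, x + u * t)      # same event in primed coordinates
time_part_only = (v_prime(t + 1e-5, x + u * t) - v_prime(t - 1e-5, x + u * t)) / 2e-5
print(unprimed, primed)       # these should agree
print(time_part_only)         # the time derivative alone is not invariant
```

Neither term is invariant by itself. Only the sum is, which is why the two terms must appear together.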


One final comment based on symmetry: under time reversal, $\frac{\partial}{\partial t}$ and $\vec v$ change sign, but not $\vec\nabla$. Hence, the
term $\nu\nabla^2\vec v$ in (11) violates time reversal. Since in Newtonian physics, time reversal is violated by friction, we can
identify the coefficient $\nu$ as a measure of viscosity. If $\nu = 0$, then (11) is known as the Euler equation.
Exercise


1 In solving problems in mechanics, when we go from the lab frame to the center of mass frame, or vice versa,
we are invoking Galilean invariance of Newton's laws without saying so explicitly. Here is a classic example. 
Let a billiard ball hit another billiard ball at rest elastically head-on. Show that the two balls move off at right 
angles to each other, as every pool shark knows. 


Notes 


1. A. Einstein, The Meaning of Relativity, p. 2. 


2. We are implicitly assuming that even if the tossing of sticks might have done something to their lengths, 
this effect does not depend on whether a stick is tossed to the right or to the left. Alternatively, Mr. P and 
Ms. U could toss pre-cut sticks to each other. 


3. By the time this book was finished, the camera had morphed into a cell phone. 
4. For application to a problem on surface growth, see QFT Nut, chapter VI.6. 


5. It is instructive to compare this symmetry-driven derivation with the standard textbook derivation, for 
example, J. S. Trefil, Introduction to the Physics of Fluids and Solids, Pergamon Press, 1975, pp. 5 and 127. 
Alice said, “In our country, there’s only one day at a time.” To
which the Red Queen responds, “That’s a poor thin way of doing 
things. Now here, we mostly have days and nights two or three 
at a time, and sometimes in the winter we take as many as five 
nights together—for warmth, you know.” 


—Lewis Carroll 


The patent clerk invents a clock 


So, how are we to modify the Galilean transformation laws so that the speed of light c does 
not obey the everyday understanding of how velocities add? Somehow, particles of matter 
(my friends and I, and the fly, in the story from the previous chapter) and particles of light 
(the camera’s flash) do not tally the passage of time and space in the same way. 

In the prologue, we saw how Einstein, through a thought experiment, showed that 
simultaneity must fail. In another elegant thought experiment, Einstein proposed a clock 
consisting of a pulse of light bouncing between two mirrors separated by distance L 
(figure 1a). He was, after all, a patent examiner living in a time of technological innovations¹
of all sorts, including ever-better chronometers.² Ms. Unprime has one of these high-tech
clocks with her. For each tick-tock, three events occur: A = light leaves the lower mirror, 
B = light bounces off the top mirror, and C = light arrives back at the lower mirror. 

Let us write down the separation between events A and C in space and time. Evidently,
$\Delta x = 0$, $\Delta y = 0$, $\Delta z = 0$, since the pulse of light gets back to where it started. By construction,
$\Delta t = 2L/c$.

Mr. Prime, the observer on the ground, watches the train carrying Ms. Unprime move
by with speed $u$ in the x direction and sees a pulse of light bouncing up and down in the y
direction. What is the separation between A and C as seen by Mr. Prime? Since he sees the
clock moving along the x-axis, he notes that $\Delta y' = 0$, $\Delta z' = 0$ (that’s what “moving along
the x-axis” means). But $\Delta x'$, unlike $\Delta x = 0$, is nonzero and is given by $\Delta x' = u\Delta t'$ (that’s
what “moving with speed $u$” means).

But how do we determine $\Delta x'$ and $\Delta t'$ separately?

Use the fabulously astonishing equation c = c! 


Figure 1 Einstein’s clock in its rest frame (a) and in a 
moving frame (b). 


It follows then that $\Delta t'$ is the distance traveled by the light pulse divided by $c$. But what
is the distance traveled? Ask Mr. Pythagoras for help! We have two right-angled triangles
back to back (figure 1b), each with legs of lengths $\frac{1}{2}u\Delta t'$ and $L$, and hypotenuse
$\sqrt{\left(\frac{1}{2}u\Delta t'\right)^2 + L^2}$. So, between tick and tock, light travels a distance of $2\sqrt{\left(\frac{1}{2}u\Delta t'\right)^2 + L^2}$,
and hence

$c\Delta t' = 2\sqrt{\left(\frac{1}{2}u\Delta t'\right)^2 + L^2}$ (1)


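Anybody who got a passing grade in high school algebra could solve this equation to
determine $\Delta t'$ and hence obtain $\Delta x'$. (See exercise 1.) But a much more clever strategy is
to recall that $\Delta x' = u\Delta t'$ and to substitute this into (1), obtaining $c\Delta t' = 2\sqrt{\left(\frac{1}{2}\Delta x'\right)^2 + L^2}$.
Now square this equation to obtain $(c\Delta t')^2 = 4\left[\left(\frac{1}{2}\Delta x'\right)^2 + L^2\right]$. Lo and behold, we have

$(c\Delta t')^2 - (\Delta x')^2 = 4\left[\left(\frac{1}{2}\Delta x'\right)^2 + L^2\right] - (\Delta x')^2 = 4L^2 = (c\Delta t)^2 = (c\Delta t)^2 - (\Delta x)^2$ (2)

since $\Delta x = 0$. A fortiori, since $\Delta y' = \Delta y$ and $\Delta z' = \Delta z$, this also implies
$(c\Delta t')^2 - (\Delta x')^2 - (\Delta y')^2 - (\Delta z')^2 = (c\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$.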


We can now consider an observer named Double Prime, with respect to whom the
mirrors are moving at some other speed along the x-axis. By the same reasoning,
$(c\Delta t'')^2 - (\Delta x'')^2 - (\Delta y'')^2 - (\Delta z'')^2 = (c\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$. Thus, we conclude that the
quadratic form $(c\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$ must be the same³ for all observers in
uniform motion relative to one another.

By this clever thought experiment, Einstein used the Pythagoras theorem for space to 
obtain a sort of generalized Pythagoras theorem for space and time. 

Distinction between a very good physicist and a great physicist! A very good physicist
knows math (high school algebra in our case) and can solve equations (solve for $\Delta t'$ in our
example), but a great physicist listens to what the equations are telling him or her (that
Nature likes Pythagoras theorem so much that she wants to generalize it!).
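
For readers who like to see numbers, here is a minimal sketch of the light clock argument. The function name and the sample values of L and u are illustrative assumptions. It takes the "very good physicist" route of solving (1) for $\Delta t'$, then checks that the combination $(c\Delta t')^2 - (\Delta x')^2$ indeed equals $(c\Delta t)^2 = 4L^2$.

```python
import math

def light_clock_in_moving_frame(L, u, c=1.0):
    """Solve c*dt' = 2*sqrt((u*dt'/2)**2 + L**2) for dt' and return (dt', dx')."""
    if abs(u) >= c:
        raise ValueError("the clock must move slower than light")
    dt_prime = 2.0 * L / math.sqrt(c**2 - u**2)  # from squaring equation (1)
    dx_prime = u * dt_prime                      # "moving with speed u"
    return dt_prime, dx_prime

L, u, c = 1.5, 0.6, 1.0
dt_rest = 2.0 * L / c                            # Ms. Unprime's tick-tock
dt_prime, dx_prime = light_clock_in_moving_frame(L, u, c)

interval_rest = (c * dt_rest) ** 2               # dx = 0 in the rest frame
interval_moving = (c * dt_prime) ** 2 - dx_prime ** 2

print("dt  =", dt_rest, " dt' =", dt_prime, " dx' =", dx_prime)
print("invariant combination:", interval_rest, interval_moving)
```

Mr. Prime's tick-tock takes longer than Ms. Unprime's, yet the two observers agree on the quadratic form.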
Lorentz’s transformation


He meant more than all the others I have met on life’s journey.
—Einstein speaking of Lorentz 


What a guy! Did he mean to include family and friends in “all the others”? 

Let us now see how we can modify the Galilean transformation (III.1.1), so that $(c\Delta t)^2 -
(\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$ does not depend on the observer.

First, the relation between $(t', x', y', z')$ and $(t, x, y, z)$ must be linear, since nothing
prevents us from scaling $\{t', x', y', z', t, x, y, z\}$ by a common multiplicative factor (in
other words, $\{t', x', y', z', t, x, y, z\} \to \{\lambda t', \lambda x', \lambda y', \lambda z', \lambda t, \lambda x, \lambda y, \lambda z\}$). Thus, we can’t
have something like $t'$ equal to $t + ax^2$ with some constant $a$. The relation has to be linear.*

Second, we have the seemingly innocuous requirement that as $u \to 0$, the transformation
must reduce to the Galilean transformation. But, importantly, notice that before the
realization that c is a universal quantity of the universe, dimensional analysis alone would
have stopped our effort cold at this point. Without c, we have only x and u to play with,
and so the only quantity with dimension of time is x/u. The linearity requirement plus
dimensional analysis dictates the form $t' = t + ax/u$ with some numerical constant $a$, but
this makes no sense as $u \to 0$. We are forced to $t' = t$.

If c is not a universal constant, we are stuck with the Galilean transformation. But with 
c now off the bench and on the field, suddenly we have a new ball game: the combination 
ux/c? has dimension of time. 

Now we can write $t' = t + \zeta ux/c^2$, with $\zeta$ some function of $u$ to be determined. But this
is not yet the most general relation. We could write $t' = w(t + \zeta ux/c^2)$, with $w$ also some
function of $u$ to be determined, provided that $w(u = 0) = 1$ so that we recover the Galilean
transformation.

Similarly, we can modify the Galilean relation $x' = x + ut$ to $x' = \tilde w(x + ut)$, where $\tilde w$
is also some unknown function of $u$ such that $\tilde w(u = 0) = 1$. Notice that we do not write
$x' = \tilde w(x + \tilde\zeta ut)$, with $\tilde\zeta$ yet another function of $u$, because we could simply give the name
$u$ to the combination $\tilde\zeta u$. The relative velocity $u$ between the two observers is defined by
the statement that $x' = 0$ implies $x = -ut$. And of course we still have $y' = y$ and $z' = z$.

Let us impose the requirement that for observers in uniform relative motion, the
combination $(c\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$ does not depend on the observer. As already
mentioned in the prologue, clearly it would be a good idea not to use some dumb English
king’s foot to measure distance, but instead to use something such as the lightsecond, so
that† the speed of light c = 1. The algebra becomes cleaner.


* Another argument is that if the relation were not linear, free particle motion would look different to different 
observers. 

† In other words, we want to use the same units along the t-axis and x-axis. Similarly, in studying rotations,
it is a good idea (and obvious common sense) to use the same units along the x-axis and along the y-axis to 
measure length. Think about what rotations would look like using centimeters along the x-axis and kilometers 
along the y-axis. 
I might add that some beginning students are nervous about c suddenly disappearing.
Please be reassured that you can always restore c easily using dimensional analysis. 

Since we have written our transformation such that the origins of the primed and
unprimed coordinates coincide, we can simply demand $t'^2 - x'^2 = t^2 - x^2$. Expanding
$w^2(t + \zeta ux)^2 - \tilde w^2(x + ut)^2 = t^2 - x^2$, we obtain 3 equations,* which determine the 3
unknowns to be $w = \tilde w = \frac{1}{\sqrt{1 - u^2}}$ and $\zeta = 1$.
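
For readers who want the intermediate algebra, here is one way to solve the three equations, written as a sketch under the assumption that the unknowns are named $w$, $\tilde w$, and $\zeta$ as above and that $c = 1$. The labels (i) to (iii) are introduced here only for bookkeeping.

```latex
\begin{align*}
\text{(i)}\quad   & w^2 - \tilde w^2 u^2 = 1        && \text{(coefficient of } t^2) \\
\text{(ii)}\quad  & w^2\zeta^2 u^2 - \tilde w^2 = -1 && \text{(coefficient of } x^2) \\
\text{(iii)}\quad & w^2\zeta = \tilde w^2            && \text{(coefficient of } tx)
\end{align*}
% Use (iii) to eliminate \tilde w^2 from (i) and (ii):
\begin{align*}
w^2\,(1 - \zeta u^2) &= 1, &
w^2\zeta\,(1 - \zeta u^2) &= 1 .
\end{align*}
% Dividing the second relation by the first gives \zeta = 1, and then
\begin{align*}
w^2 = \tilde w^2 = \frac{1}{1 - u^2}
\quad\Longrightarrow\quad
w = \tilde w = \frac{1}{\sqrt{1 - u^2}},
\end{align*}
% where the positive root is selected by the condition w(u = 0) = \tilde w(u = 0) = 1.
```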


We have thus derived the Lorentz transformation† for a boost in the x direction:

$ct' = \frac{ct + \frac{u}{c}x}{\sqrt{1 - \frac{u^2}{c^2}}}$

$x' = \frac{x + \frac{u}{c}ct}{\sqrt{1 - \frac{u^2}{c^2}}}$

$y' = y$

$z' = z$ (3)


with c restored for the reader’s convenience. You are invited to write down the Lorentz 
transformation for a boost in an arbitrary direction u. 

By construction, the Lorentz transformation reduces to the Galilean transformation in
the domain of everyday experience, namely in the limit $u \ll c$. Simply take the $c \to \infty$
limit.

Since $\sqrt{1 - \frac{u^2}{c^2}}$ becomes imaginary for $u > c$, we learn that a universal speed limit
$u < c$ exists. The train cannot go faster than the speed of light without all of our equations
breaking down.
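
The properties just listed can be illustrated with a short sketch of the boost (3). The function names and sample numbers below are illustrative assumptions. The code checks that $(ct)^2 - x^2$ is unchanged, that boosting back with $-u$ undoes the transformation, and that everyday speeds reproduce the Galilean result to high accuracy.

```python
import math

def boost_x(t, x, u, c=1.0):
    """Boost along x following (3): returns (t', x') for relative velocity u."""
    if abs(u) >= c:
        raise ValueError("sqrt(1 - u^2/c^2) is not real for |u| >= c")
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    t_prime = gamma * (t + u * x / c**2)
    x_prime = gamma * (x + u * t)
    return t_prime, x_prime

def interval(t, x, c=1.0):
    """The quadratic form (c t)^2 - x^2 for a separation from the origin."""
    return (c * t) ** 2 - x ** 2

# Units with c = 1: the interval is the same for both observers.
t, x, u = 2.0, 1.0, 0.6
tp, xp = boost_x(t, x, u)
print("interval before and after:", interval(t, x), interval(tp, xp))

# Boosting back with -u recovers the original coordinates.
t_back, x_back = boost_x(tp, xp, -u)

# Everyday regime: a train at 30 m/s, with c in meters per second.
c_si = 3.0e8
t_slow, x_slow = boost_x(2.0, 5.0, 30.0, c=c_si)
print("slow boost:", t_slow, x_slow, " Galilean:", 2.0, 5.0 + 30.0 * 2.0)
```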


Note that $c\,dt' = \frac{c\,dt + \frac{u}{c}dx}{\sqrt{1 - \frac{u^2}{c^2}}} \neq c\,dt$! Our fallacy was that we thought for sure that when
1 second passed for my friends and I, and the fly, 1 second also passed for the hobo
hitchhiker. This assumption went into the derivation of Galileo’s common sense addition
law of velocities, which is so common sensical that we invoke it in everyday life without
ever feeling the need to prove it.

There is no universal clock in the universe ticking off the same universal time for 
everyone. 


* Namely $w^2 - \tilde w^2u^2 = 1$, $w^2\zeta^2u^2 - \tilde w^2 = -1$, and $w^2\zeta = \tilde w^2$, upon equating the coefficients of $t^2$, $x^2$,
and $tx$.


† Interestingly, in 1887, the German physicist W. Voigt came close to having this transformation. In Voigt’s
transformation, the right hand side of (3) carried an extra overall factor involving $\sqrt{1 - \frac{u^2}{c^2}}$. Not knowing Voigt’s work, in 1895, Lorentz
derived the transformation in a better form than Voigt’s. J. Larmor found the exact form in 1900. Not knowing 
Larmor’s work, Lorentz discovered the exact form in 1904. In 1905, H. Poincaré, knowing only of Lorentz’s work, 
developed the transformation further and named it the Lorentz transformation. As for Einstein, he only knew 
the 1895 version of the Lorentz transformation. The term “Lorentz transformation” is an example of the Matthew 
principle: Whoever has will be given more. . . . Whoever does not have, even what he has will be taken from him 
(Matthew 13:12). 
Light cone coordinates


So there we have it, the Lorentz transformation that replaces the Galilean transformation:

$t' = \frac{t + ux}{\sqrt{1 - u^2}}$, $x' = \frac{x + ut}{\sqrt{1 - u^2}}$, $y' = y$, $z' = z$ (4)

(We have set c = 1 once again.)


Remarkably, once we require that $t'^2 - x'^2 = t^2 - x^2$, the derivation takes only a few
lines of high school algebra. In fact, the derivation becomes even simpler if we use the
so-called* light cone coordinates $x^\pm = t \pm x$ I introduced back in chapter II.3. Indeed, there
I already gave you a sneak preview of the Lorentz transformation, in the context of the
purely Newtonian problem of a vibrating string, not anywhere near an electromagnetic
wave and its universal speed c! One of the most appealing features of theoretical physics
is its unified perspective.

The key observation is the identity $a^2 - b^2 = (a + b)(a - b)$, so that $t^2 - x^2 = (t + x)(t - x) = x^+x^-$.
(As usual, we consider two observers in relative uniform motion along the x
direction, with y and z merely going along for the ride and hence asking to be suppressed.)
Evidently, $x^+x^-$ is left invariant if we multiply $x^+$ by some factor and divide $x^-$ by the same
factor. The Lorentz transformation is strikingly simple in these coordinates:

$x'^+ = e^{\varphi}x^+$ and $x'^- = e^{-\varphi}x^-$ (5)

with $\varphi$ some real parameter.

From (5) you can immediately recover (4): $t' = \frac{1}{2}(x'^+ + x'^-) = \frac{1}{2}(e^{\varphi}x^+ + e^{-\varphi}x^-) =
(\cosh\varphi)t + (\sinh\varphi)x$, and similarly $x' = (\sinh\varphi)t + (\cosh\varphi)x$. It is easy to relate the
boost parameter or angle $\varphi$ to the relative velocity $u$. From the condition that $x' = 0$ implies
$x = -ut$, we discover that

$u = \frac{\sinh\varphi}{\cosh\varphi} = \tanh\varphi$ (6)
cosh @ 


(Purists might frown at physicists calling $\varphi$ an angle, since it ranges from $-\infty$ to $+\infty$ as $u$
ranges from $-1$ to $+1$, but the terminology has the virtue of emphasizing the connection†
with the rotation angle.) Using the identity $\cosh^2\varphi - \sinh^2\varphi = 1$ mentioned back in
chapter II.3, we then obtain

$\cosh\varphi = \frac{1}{\sqrt{1 - u^2}}$ and $\sinh\varphi = \frac{u}{\sqrt{1 - u^2}}$ (7)
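
The route through light cone coordinates can be compared with (4) directly. In this minimal sketch (function names and sample values are illustrative assumptions, with c = 1), the boost is applied by stretching $x^+$ and shrinking $x^-$, and the result is compared with the explicit form (4). The last line shows a convenient by-product of (6): boost parameters simply add when two boosts along x are composed.

```python
import math

def rapidity(u):
    """Boost parameter phi with u = tanh(phi), in units where c = 1."""
    return math.atanh(u)

def boost_light_cone(t, x, phi):
    """Apply (5): multiply x+ = t + x by e^phi and x- = t - x by e^-phi."""
    x_plus = math.exp(phi) * (t + x)
    x_minus = math.exp(-phi) * (t - x)
    return 0.5 * (x_plus + x_minus), 0.5 * (x_plus - x_minus)

def boost_direct(t, x, u):
    """Apply (4) with c = 1."""
    g = 1.0 / math.sqrt(1.0 - u * u)
    return g * (t + u * x), g * (x + u * t)

u = 0.6
t1, x1 = boost_light_cone(2.0, 1.0, rapidity(u))
t2, x2 = boost_direct(2.0, 1.0, u)
print("light cone route:", t1, x1)
print("direct route:    ", t2, x2)

# Composing two boosts of u = 0.5 each: add the boost parameters, then convert back.
combined = math.tanh(rapidity(0.5) + rapidity(0.5))
print("two boosts of 0.5 combine to", combined)
```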


* The terminology will become clear in the next chapter. 
† See appendix 1 of chapter III.3 for further discussion.
The light cone coordinates thus provide a “10-second derivation” of the Lorentz transformation.
In many situations (for example, the development⁴ of string theory), the coordinates
$x^\pm$ are much more convenient than t and x. In the same way, an earlier generation
found that rotations are more easily handled by going to complex and polar coordinates
$z = x + iy = re^{i\theta}$ and $z^* = x - iy = re^{-i\theta}$.


How velocities actually add 


We can now easily derive the correct law of addition of velocities. I will let you work out
the more general case (exercise 3); here we consider the simple case of an object moving
in the x direction. The relevant part of (4) reads

$t' = \frac{t + ux}{\sqrt{1 - u^2}}$, $x' = \frac{x + ut}{\sqrt{1 - u^2}}$ (8)

Let the velocity of an object as seen by Ms. U be $v = \frac{dx}{dt}$ and as seen by Mr. P on the
ground be $v' = \frac{dx'}{dt'}$. Then, dividing $dx'$ by $dt'$ as given by (8), we obtain

$v' = \frac{dx'}{dt'} = \frac{u\,dt + dx}{dt + u\,dx} = \frac{u + \frac{dx}{dt}}{1 + u\frac{dx}{dt}} = \frac{v + u}{1 + uv}$ (9)


Instead of $v' = v + u$, the correct law of addition of velocities contains a crucial denominator:

$v' = \frac{v + u}{1 + uv}$ (10)


The function $v' = f_u(v)$ has some remarkable properties. It is symmetric under $u \leftrightarrow v$,
as it should be. If the object is slowly moving with $v \ll 1$, then $v' \simeq v + u$, in accordance
with everyday intuition. But if the object happens to be a particle of light so that $v = 1$,
then $v' = \frac{1 + u}{1 + u} = 1$ independent of $u$, contrary to everyday intuition. If we solve (10) for
$v = f_u^{-1}(v')$, we obtain $v = \frac{v' - u}{1 - uv'} = f_{-u}(v')$, consistent with $f_{-u}(f_u(v)) = v$, of course. If
Mr. P sees Ms. U going by with velocity $u$, then of course Ms. U would see Mr. P going by
with velocity $-u$.
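
The properties of (10) listed above are easy to verify numerically. The following sketch uses units with c = 1; the function name and the sample velocities are illustrative assumptions.

```python
def add_velocities(v, u):
    """Relativistic addition law (10) in units where c = 1."""
    return (v + u) / (1.0 + u * v)

v, u = 0.5, 0.3

# Symmetric under the exchange of u and v.
symmetric_gap = add_velocities(v, u) - add_velocities(u, v)

# Light stays light, whatever u is.
light = add_velocities(1.0, u)

# Adding -u undoes adding u, i.e. f_{-u}(f_u(v)) = v.
round_trip = add_velocities(add_velocities(v, u), -u)

# Two large velocities never add up to more than 1.
fast = add_velocities(0.9, 0.9)

# For slow motion the denominator is negligible and v' is close to v + u.
slow = add_velocities(1e-6, 2e-6)

print(symmetric_gap, light, round_trip, fast, slow)
```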

To “feel” how counterintuitive (10) is, imagine yourself carrying the ball in a game of 
American football, running toward the goal line at 9 meters per second. Behind you is 
the safety, chasing after you at 10 meters per second. You feel the safety gaining on you 
at a relatively benign 1 meter per second. But suppose the safety had dropped way back 
toward the goal line and is now coming at you at —10 meters per second. You see him 
fast approaching at a bone-crunching 19 meters per second, a factor of 19 faster! (See 
figure 2.) Now suppose the safety has strapped on a rocket moving at near light speed.
Then, regardless of whether he is chasing after you or coming toward you, you see him
closing in at very nearly $\pm c$. The rate of approach is almost the same in the two
cases, and becomes the same as $v'$ reaches light speed.
Figure 2 Relativity in a game of American football.


“As the ancients dreamed” 


In a certain sense, therefore, I hold it true that pure thought can
grasp reality, as the ancients dreamed. 


—A. Einstein, 1954 


I do not want to go into the historical controversy of whether Einstein knew about the 
Michelson-Morley experiment when he worked out special relativity. I am inclined to
believe his statement that he didn’t. The independence of c on the observer follows from 
Maxwell’s theory, while begging the question of what medium electromagnetic waves 
propagate in. Instead, I discuss the existence of a speed limit. 

Let us imagine how a philosopher (or perhaps an “ancient” in a civilization far far away) 
could have argued. Suppose there is no speed limit. Then something could have gotten 
from here to anywhere else in the universe in an instant. This is clearly absurd. So suppose 
there is a speed limit c. An observer sees this thing moving at the speed c. But another 
observer moving at a speed u in a direction opposite to this thing would, according to the 
Galilean transformation, see it moving at a speed c + u, but this would contradict c being 
a speed limit. Ergo, the common sense Galilean transformation has to be modified. 

This is not how the Lorentz transformation was discovered in our civilization, but it 
could have happened this way elsewhere. To me, the logical fallacy in this argument by 
pure thought is that it may take an infinite amount of energy to make something go at 
infinite speed. Indeed, that is what happens in a Newtonian universe, which is logically 
consistent if you don’t ask disallowed questions such as how the universal clock was set 
up. (Or, who set it up?) 

This brief digression shows why it may be wise to focus on physics. 
Exercises


1 Derive the Lorentz transformation directly by solving (1).

2 Show that the inverse transformation giving $t, x, y, z$ in terms of $t', x', y', z'$ is given by (3) with $u \to -u$.

3 Derive the law of addition of velocities in general, when $\vec u$ and $\vec v$ point in different directions. The law must
obey rotational invariance.

4 Consider a series of observers, with each observer seeing the preceding observer moving away along the
x-axis with speed $u$. Let the 0th observer see a particle moving with speed $v_0$ in her frame. Then the (k + 1)st
observer sees the particle moving with speed $v_{k+1} = \frac{v_k + u}{1 + uv_k}$. Find the limiting value of $v_k$ as k tends to infinity.

Notes 


1. According to the literary scholar Dame Gillian Beer, around 1865, when Lewis Carroll, an early practitioner of
photography, wrote Alice in Wonderland, photography “froze or made portable a moment and a place.” To me,
that could have easily led to the concept of events in spacetime, as we will discuss in the next chapter. Carroll
was notoriously concerned with the notion of time, with for example the white rabbit constantly consulting
his pocket watch, an affectation and necessity when railways, with timetables and Einstein’s trains, came into
common use. To a physicist like myself, the two Alice books are full of allusions to concepts from physics:
gravity, scale transformation, and mirror reflection, to name a few.


2. P. Galison, Einstein’s Clocks, Poincaré’s Maps.


3. Some authors state that the invariance of this quadratic form follows immediately from $(c\Delta t)^2 = (\Delta x)^2 +
(\Delta y)^2 + (\Delta z)^2$ and $(c\Delta t')^2 = (\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2$. At best, this argument is highly misleading and
incomplete: if you know only that $(c\Delta t')^2 - (\Delta x')^2 - (\Delta y')^2 - (\Delta z')^2 = (c\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$
when both sides vanish, you cannot conclude that they are equal in general. You need Einstein and his clock.


4. See, for example, B. Zwiebach, A First Course in String Theory, 2009.
Unifying time and space


Henceforth space by itself, and time by itself, are doomed to fade 
away into mere shadows, and only a kind of union of the two will 
preserve an independent reality. 


—Hermann Minkowski 


In a far reaching move, Hermann Minkowski (1864-1909) introduced geometry into 
special relativity. Some of the notions commonly attributed to Einstein, such as a 4- 
dimensional* spacetime, are actually due to Minkowski. 

In Euclidean space, the invariance of the combination $dl^2 = dx^2 + dy^2 + dz^2$ under
rotation allows us to define $dl$ as the distance between two points. Two observers whose
coordinate systems are related by a rotation measure the same distance. Indeed, as I
emphasized in chapter I.3, the invariance of $dl^2$ defines rotation.

With profound insight, Minkowski realized in 1907 (a mere¹ 2 years after Einstein
introduced special relativity) that the invariance of the combination $dt^2 - dx^2 - dy^2 - dz^2$
allows us to talk about the “distance” or “separation” between two points in spacetime. We
saw in chapter III.2 that this invariance determines the Lorentz transformation. Similar to
the case of rotation, two observers in uniform motion relative to each other can now agree
on the spacetime distance† between two points.


* In fact, psychologists tell us that some of our difficulties in life stem from a natural tendency to view time 
as if it were like a spatial dimension, as reflected in many languages. In English, one says that, for example, the 
past is behind us, we are rushing toward the future, and so on.²

† As an example of a lyrical confounding of space and time, consider Rudyard Kipling’s line “Damned from
here to eternity,” which subsequently lent itself to the title of a famous novel and film, not to mention a Yale 
drinking song. Sounds so much better than “from now to eternity,” something that a Galilean physicist might 
have said. 
Euclidean geometry is specified by the distance between two nearby points in space,
given by $dl^2 = dx^2 + dy^2 + dz^2$, while Minkowskian geometry is specified by the distance
between two nearby points in spacetime, given by $ds^2 = -dt^2 + (dx^2 + dy^2 + dz^2)$. The
quantity $ds^2$ naturally generalizes Pythagoras’s $dl^2$ and allows us to do geometry in spacetime.*

The “modern” way of looking at special relativity is to emphasize the geometry of 
spacetime, an approach which will lead us naturally to general relativity and Einstein 
gravity. 


Distance in Minkowskian geometry 


With this singular minus† sign in $ds^2$, the geometry of Minkowski spacetime is definitely
and defiantly not Euclidean. In particular, the infinitesimal quantity $ds^2 = -dt^2 + (dx^2 +
dy^2 + dz^2)$ between two nearby points may not even be positive. Conceptually, it may be
preferable to think of $ds^2$ as a peculiar symbol in its own right, not necessarily as the square
of a real quantity. We say that the separation between two nearby points in spacetime is
timelike if $ds^2 < 0$, spacelike if $ds^2 > 0$, and lightlike or null if $ds^2 = 0$. The term “lightlike,”
to be used interchangeably with “null” in this text, is evidently due to the fact that light
traces out a straight line path in spacetime given by $dt^2 = (dx^2 + dy^2 + dz^2)$.


$dt^2 > (dx^2 + dy^2 + dz^2)$   timelike
$dt^2 = (dx^2 + dy^2 + dz^2)$   lightlike or null
$dt^2 < (dx^2 + dy^2 + dz^2)$   spacelike   (1)
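
The classification (1) translates directly into a few lines of code. This is a minimal sketch with c = 1; the function name is an illustrative assumption, and the exact comparison with zero is only reliable for inputs such as the integer-valued examples used here (floating-point data would need a tolerance).

```python
def classify_separation(dt, dx, dy=0.0, dz=0.0):
    """Classify a separation by the sign of ds^2 = -dt^2 + dx^2 + dy^2 + dz^2."""
    ds2 = -dt**2 + dx**2 + dy**2 + dz**2
    if ds2 < 0:
        return "timelike"
    if ds2 > 0:
        return "spacelike"
    return "lightlike"

print(classify_separation(2, 1))      # more time than space
print(classify_separation(1, 2))      # more space than time
print(classify_separation(5, 3, 4))   # 5^2 = 3^2 + 4^2, on the light cone
```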


The classification timelike, lightlike or null, and spacelike, can obviously be applied to 
curves in Minkowski spacetime, not just straight lines. A curve is timelike if the separation
between any two infinitesimally nearby points on the curve is timelike. In other words, the 
tangent vector on a timelike curve is a timelike vector. The worldline of a massive particle 
is timelike. Similarly, we can define spacelike curves. The worldline of a massless 
particle (like a photon) is lightlike or null. 

It is hardly surprising, then, that many geometrical facts we take for granted no longer 
hold true in Minkowskian spacetime. In particular, a straight line between two points in 
spacetime is not necessarily the path of shortest distance. 

Define the straight line “distance” between two points separated by $\Delta t$, $\Delta x$, $\Delta y$, and $\Delta z$
to be $\sqrt{(\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2}$. Consider the triangle in the (t-x) plane formed
by the three points A = (0, 0), C = (2, 0), and B = (1, x) for x < 1. See figure 1. The
three sides have “lengths” $d_{AC} = \sqrt{2^2 - 0^2} = 2$ and $d_{AB} = d_{BC} = \sqrt{1 - x^2}$. Notice that


* Perhaps a more compact word is “zaum,” made up from the German words “Raum” and “Zeit,” which form 
the title of the classic book by Hermann Weyl on space and time. 

† Imagine telling Pythagoras that time has something to do with flipping a sign in his magical formula. You
would have been certified as a total nut. We now know that time differs from space by a sign, but that hardly 
means we understand time. Physicists’ time is not the same as psychological time, whatever that is. 
Figure 1 A straight line between two points in
spacetime is not necessarily the path of shortest 
distance. In the triangle shown, the sum of the 
length of the two sides, AB and BC, is less than 
the length of the side AC. 


$d_{AB} + d_{BC} = 2\sqrt{1 - x^2}$ is always less than $d_{AC} = 2$. Indeed, as $x \to 1$, the distances
$d_{AB} = d_{BC}$ approach 0, becoming null or lightlike. That’s what a minus sign can do for you!
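
The triangle example can be tabulated in a few lines. In this sketch the helper names are illustrative assumptions, points are written as (t, x) with c = 1, and only timelike or null separations are allowed.

```python
import math

def minkowski_length(p, q):
    """'Distance' sqrt(dt^2 - dx^2) between points p = (t, x) and q = (t, x)."""
    dt, dx = q[0] - p[0], q[1] - p[1]
    s2 = dt**2 - dx**2
    if s2 < 0:
        raise ValueError("spacelike separation: the square root is not real")
    return math.sqrt(s2)

A, C = (0.0, 0.0), (2.0, 0.0)

def detour_length(x):
    """Length of the broken path A -> B -> C with B = (1, x)."""
    B = (1.0, x)
    return minkowski_length(A, B) + minkowski_length(B, C)

direct = minkowski_length(A, C)
for x in (0.0, 0.3, 0.6, 0.9, 1.0):
    print(f"x = {x:3.1f}   AB + BC = {detour_length(x):.4f}   AC = {direct:.4f}")
```

The detour is never longer than the straight side AC, and its length shrinks to zero as B approaches the light cone of A.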

This little example captures quite a bit of the geometry of Minkowski spacetime, as we 
will see. In exercise 10, you will generalize this example. 


“A stubbornly persistent illusion” (?) 


With this most valiant piece of chalk I might project upon the
blackboard four world-axes. Since merely one chalky axis, as
it is, consists of molecules all a-thrill, and moreover is taking
part in the earth’s travels in the universe, it already affords us
ample scope for abstraction; the somewhat greater abstraction
associated with the number four is for the mathematician no
infliction. . . . Then we obtain, as an image, so to speak, of the
everlasting career of the substantial point, a curve in the world,
a world-line. . . . The whole universe is seen to resolve itself into
similar world-lines, and I would fain anticipate myself by saying
that in my opinion physical laws might find their most perfect
expression as reciprocal relations between these world-lines.


—Hermann Minkowski? 


That piece of chalk was certainly valiant. In Minkowski spacetime, time and space are
distinguished only by a sign, but what a sign! No doubt one of the most significant* in all
of physics.


* Like objects repel in electromagnetism and attract in gravity. This amazing fact, which to a large extent is 
responsible for the physical world as we know it, can be explained in terms of this sign. See QFT Nut, p. 37. 


Figure 2 The worldlines of several particles. 


Since an event is specified by where and when it happened, we call a point in spacetime 
an event. As a particle moves about in spacetime, it traces out a curve known as the 
worldline. (See figure 2, in which we see the worldlines of several particles, including 
one at rest.) 

The distinguished physicist George Gamow wrote a charming autobiography attractively 
titled My World Line. You can imagine the world as consisting of a tangle of worldlines, 
with some coming together and intertwined with one another for a while and then moving 
apart. Somehow, we can experience this tangle only one time slice at a time. 

This picture of a tangle of worldlines, perhaps with a reality that goes beyond time, 
has prompted many pseudo-philosophical ramblings. Einstein himself gave in to this 
temptation. After the sudden death of his school friend Michele Besso, who had helped 
him understand time in special relativity, Einstein wrote, only weeks before his own death 
(as it would turn out) to Besso’s son: “Now he has departed from this strange world a little 
ahead of me. That signifies nothing. For us, physicists in the soul, the distinction between 


past, present, and future is only a stubbornly persistent illusion.”* 


Light cone 


Light propagates at the speed of light, that is, with $dt^2 = dx^2 + dy^2 + dz^2$, as already
mentioned above. Thus, light rays emitted from the origin of spacetime span a cone,
known as the future light cone, in Minkowski space, defined by $t^2 = x^2 + y^2 + z^2$ and
$t > 0$. Similarly, light rays that reach the origin span the past light cone defined by $t^2 =
x^2 + y^2 + z^2$ and $t < 0$. (See figure 3, in which we have to suppress the z-axis.) Note that
every point in spacetime has its own future and past light cones. Light cones everywhere!

Since a material object can’t move faster than c, its worldline is subject to the constraint
$dt^2 \geq dx^2 + dy^2 + dz^2$, with the equality allowed only if the object is massless. In other
words, at all points along the worldline of a massive particle, its slope has to be greater
than 45°. The particle must move inside its future light cone at all points, as indicated in
figure 4.


Figure 3 The future and past light cones.


Figure 4 A particle must move inside its future
light cone at all points along its worldline.


Causality thus states that what happens at a point O in spacetime (see figure 3) can
only influence what happens inside its future light cone but not what happens outside
its future light cone (such as the event A in figure 3). Similarly, only events that occur
inside its past light cone (such as the event B in figure 3) can influence what happens
at O.

Note that if we restore c, the light cone flattens out as $c \to \infty$, so that the future light
cone encompasses all of future t > 0, and we are back to the Galilean view of space and
time. (In figure 5, call where we are sitting the origin of spacetime; then the entire shaded
region is in our past “light cone.”)
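
The causal structure described above can be phrased as a simple test. In this sketch events are written as (t, x, y, z); the function name and the sample events are illustrative assumptions, and events exactly on the cone are counted as reachable (by light). Raising c shows the cone opening up toward the Galilean picture.

```python
def can_influence(source, target, c=1.0):
    """True if target lies on or inside the future light cone of source."""
    dt = target[0] - source[0]
    dr2 = sum((b - a) ** 2 for a, b in zip(source[1:], target[1:]))
    return dt >= 0 and (c * dt) ** 2 >= dr2

O = (0.0, 0.0, 0.0, 0.0)
inside_future = (2.0, 1.0, 0.0, 0.0)    # later, and close enough
outside_cone = (1.0, 2.0, 0.0, 0.0)     # later, but too far away
in_the_past = (-2.0, 1.0, 0.0, 0.0)     # earlier than O

print(can_influence(O, inside_future))      # O can affect this event
print(can_influence(O, outside_cone))       # not reachable at speed <= c
print(can_influence(in_the_past, O))        # an event in O's past light cone
print(can_influence(O, outside_cone, c=1.0e3))  # much larger c: cone nearly flat
```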
Figure 5 The Galilean limit of the past light cone.


Proper time 


Consider two events A and B, infinitesimally separated, on the worldline shown in figure 
4. Our two observers Ms. Unprime and Mr. Prime agree on the Minkowskian distance 
between A and B, namely $dt^2 - dx^2 - dy^2 - dz^2 = dt'^2 - dx'^2 - dy'^2 - dz'^2$. That’s the
whole point of special relativity! I will go almost painfully slow here for reasons that will 
become clear. First, let the figure shown be the one Ms. Unprime would actually draw, 
using her clocks and rulers. Mr. Prime would draw a different, but analogous, figure (which 
we are not showing) using his clocks and rulers. 

Suppose the worldline in figure 4 is actually that traced out by a Dr. D, using coordinates
$t''$, $x''$, $y''$, and $z''$. Since the worldline is curved, Dr. D is actually accelerating this way and
that, definitely not an inertial kind of guy. To Dr. D, the spatial separation between A and B
is given by $dx'' = 0$, $dy'' = 0$, $dz'' = 0$, of course. You are not going anywhere in your rest
frame, by definition.

Now, special relativity informs us that the quantity $dt''^2 - dx''^2 - dy''^2 - dz''^2 = dt^2 -
dx^2 - dy^2 - dz^2 = dt'^2 - dx'^2 - dy'^2 - dz'^2$ is the same for all observers. Thus, it makes
sense for Ms. Unprime to define

$d\tau^2 = dt^2 - dx^2 - dy^2 - dz^2$ (2)


Since $dx'' = 0$, $dy'' = 0$, $dz'' = 0$, we have $d\tau^2 = dt''^2 - dx''^2 - dy''^2 - dz''^2 = dt''^2$, so that
we can interpret $d\tau$ as the actual biological time elapsed between A and B as experienced
by Dr. D, were he or she a biological organism. We call $d\tau$ the proper time interval* for
Dr. D.

Notice that $d\tau$ is not the proper time lapse as experienced by Ms. Unprime. Nor is it the
proper time as experienced by Mr. Prime. The point of special relativity is that Ms. Unprime
and Mr. Prime agree that $d\tau$ is the proper time experienced by Dr. D.


* The term “proper” time is meant to refer to the time felt by Dr. D him- or herself, but to me hints of a lesson 
in etiquette. A better term would have been “eigenzeit” or “eigentime,” as in “eigenvalue” or “eigenvector.” 
Distance measure in spacetime


The proper time naturally provides a “distance” measure in spacetime along an observer’s 
worldline. What is the proper time experienced by Dr. D between two events A and C far 
apart on his or her worldline? In physics, we assume that time is additive, and so we simply 
sum up the infinitesimal proper time lapses to obtain $\int_A^C d\tau$. The statement, inherent in
Riemannian geometry and Einstein gravity, is that space and time can be experienced in 
infinitesimal bite-sized segments. I don’t see how we could even do physics without this, 
but of course it is still an assumption. 
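We have (compare this with the length of a curve given in (II.2.2))

$$\int_A^C d\tau = \int_A^C \sqrt{dt^2 - d\vec{x}^2} = \int_A^C d\tau \sqrt{\left(\frac{dt}{d\tau}\right)^2 - \left(\frac{d\vec{x}}{d\tau}\right)^2} = \int_A^C dt \sqrt{1 - \left(\frac{d\vec{x}}{dt}\right)^2} \tag{3}$$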


I have purposely given four different expressions for the spacetime distance between A 
and C. The second expression emphasizes that it is completely parametrization invariant. 
The third expression uses the proper time itself as the integration parameter. The fourth 
expression uses coordinate time as the parameter. Whose coordinate time? Ms. Unprime’s. 

The fourth expression in (3) also explains our observation about the triangle in figure 1. 
The line AC is in fact the path of longest distance between A and C, since dx = 0 maximizes 
the square root in that fourth expression in (3). Any curve (as long as it doesn’t have any 
spacelike segment, for which the proper time interval in the integrand would be imaginary) 
joining A and C will be shorter in length, thanks to the minus sign in (3). 
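
As a rough numerical illustration of the fourth expression in (3), the sketch below approximates the integral of $\sqrt{1 - (dx/dt)^2}\,dt$ (in units with c = 1 and one spatial dimension) for two worldlines joining the same pair of events. The function name `proper_time`, the step count, and the particular wiggly path are assumptions made for this example and do not come from the text.

```python
import math

def proper_time(x_of_t, t_start, t_end, steps=20_000):
    """Approximate the integral of sqrt(1 - (dx/dt)^2) dt along x(t), with c = 1."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for n in range(steps):
        t0 = t_start + n * dt
        # Average velocity over this small step (must stay below 1 in magnitude).
        v = (x_of_t(t0 + dt) - x_of_t(t0)) / dt
        total += math.sqrt(1.0 - v * v) * dt
    return total

T = 10.0  # coordinate time between the two events, an illustrative value

# Straight worldline: the particle sits at x = 0 the whole time.
straight = proper_time(lambda t: 0.0, 0.0, T)

# Wiggly worldline that starts and ends at x = 0; its speed never exceeds ~0.32.
wiggly = proper_time(lambda t: 0.5 * math.sin(2.0 * math.pi * t / T), 0.0, T)

print(straight, wiggly)
```

The straight worldline yields the larger value, in line with the statement that the line AC is the path of longest proper time between A and C.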

One must exercise considerable care in looking at spacetime diagrams such as those in 
the next chapter. It is easy to fall into the trap of thinking Euclidean. 


Motion of a free particle 


Students sometimes feel that the equation of motion of a particle can be derived somehow.
To the contrary, it has to be abstracted from empirical observations and then enunciated
by some great physicist, made great by said enunciation. Newton enunciated that, in the
absence of an external force, a particle will maintain a constant velocity; that is,
$\frac{d^2\vec{x}}{dt^2} = 0$, or in other words, $\frac{d\vec{x}}{dt}$ stays constant.

How does a free particle move in special relativity? Again, the answer had to be enunci- 
ated by Einstein and then verified by experiments. But since Einstein came after Newton, 
the postulated equation must reduce to Newton’s equation for a slowly moving particle. In 
addition, there is the very stringent requirement that Ms. Unprime and Mr. Prime have to 
agree that the particle is free. Both of these requirements are satisfied by 

$$\frac{d^2X^\mu}{d\tau^2} = 0 \tag{4}$$
Here $X^\mu(\tau) = (X^0(\tau), X^i(\tau))$, with $i = 1, 2, 3$, denotes the location of the particle in
spacetime as measured by Ms. Unprime, and $\tau$ is the proper time determined by
$d\tau = \sqrt{(dX^0)^2 - (d\vec{X})^2}$. To see what equation Mr. Prime subscribes to, we simply note that $X'$
is related to $X$ by a linear transformation, $X'^\mu = \Lambda^\mu{}_\nu X^\nu$, and so

$$\frac{d^2X'^\mu}{d\tau^2} = \Lambda^\mu{}_\nu \frac{d^2X^\nu}{d\tau^2} = 0 \tag{5}$$

and that $d\tau$ is the same for Ms. Unprime and Mr. Prime. Note that all we need here is
that the relation between $X'$ and $X$ must be linear and that $\Lambda^\mu{}_\nu$ must depend only on the
relative velocity between the two observers, so that it can pass through $\frac{d}{d\tau}$ untouched.

We will come back to the issue of how physical quantities observed by Ms. Unprime and 


Mr. Prime are related in chapter III.6. 


The Minkowskian metric for spacetime 


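Clearly, a more compact notation would be desirable. Let us write $d\tau^2 = dt^2 - dx^2 - dy^2 -
dz^2 = -\eta_{\mu\nu}dx^\mu dx^\nu$, with $x^0 = t$ and $\eta_{\mu\nu}$ defined by

$$\eta_{00} = -1, \quad \eta_{11} = \eta_{22} = \eta_{33} = +1, \quad \text{and} \quad \eta_{\mu\nu} = 0 \ \text{if} \ \mu \neq \nu \tag{6}$$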


As always, the Einstein summation convention holds unless stated otherwise. 

You should be reminded of the distance squared between two nearby points in generically
curved spaces $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$, with the metric $g_{\mu\nu}$ that we studied in chapters I.5
and I.6. We may regard $\eta_{\mu\nu}$ as the flat Minkowskian metric of spacetime, just as we regard
$\delta_{ij}$ as the Euclidean metric of ordinary flat space. Geometry was originally the science of
measuring the earth; here we are measuring spacetime.

From here on, the discussion parallels completely the discussion of rotation in chapter
I.3 and of curved spaces in chapters I.5 and I.6, except that $\eta_{\mu\nu}$ replaces $\delta_{ij}$ and $g_{\mu\nu}$,
respectively. Inevitably, I will repeat some of the earlier discussions, as I will be talking
about vectors and tensors, upper and lower indices, all that good stuff. Given the confusion 
that some beginning students have, I feel quite strongly that some repetition is worthwhile. 
Indeed, my pedagogical strategy in this text is to proceed as follows: 


rotation → coordinate transformation → curved space → Minkowskian spacetime → curved spacetime


These five topics can, and should, be treated as an organic whole. 

As in our earlier discussions, $dx^\mu = (dt, dx, dy, dz)$ defines the basic or ur-vector. A
vector $p^\mu$ in spacetime is defined as a set of four numbers $p^\mu = (p^0, p^1, p^2, p^3)$ that
transform in the same way as $dx^\mu$ transforms under the Lorentz transformation. It is
sometimes called a 4-vector to distinguish it from ordinary 3-vectors in space. Evidently,
the 4-vector $p^\mu$ contains the 3-vector $p^i = (p^1, p^2, p^3)$.

The square of the length of the 4-vector $p$ is defined as $p^2 \equiv p \cdot p = \eta_{\mu\nu}p^\mu p^\nu$. (The
dot will be often omitted henceforth.) Indeed, just as rotations can be defined as linear
transformations that leave the lengths of 3-vectors unchanged, Lorentz transformations
are defined as linear transformations that leave the (Minkowskian) lengths of 4-vectors
unchanged. In particular, $d\tau^2 = -\eta_{\mu\nu}dx^\mu dx^\nu$ is left invariant.

For two arbitrary 4-vectors $p$ and $q$, consider the vector $p + aq$ (for $a$ an arbitrary real
number) and its length squared $p^2 + 2a\,p \cdot q + a^2q^2$. Since $a$ is arbitrary and since Lorentz
transformations leave lengths unchanged, they also leave the scalar dot product between
two 4-vectors

$$p \cdot q = \eta_{\mu\nu}p^\mu q^\nu = -p^0q^0 + p^1q^1 + p^2q^2 + p^3q^3 \tag{7}$$


unchanged. (Recall that we used a similar argument in chapter I.3.) 
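
The invariance of (7) can be checked numerically with a few lines of Python. This is only a sketch: the helper names `dot` and `boost_x`, the sample components, and the velocity u = 0.6 are illustrative assumptions, and the boost is written in the standard form with c = 1, using the same sign convention as the Doppler formula later in this chapter.

```python
import math

ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)  # diagonal of eta_{mu nu} as defined in (6)

def dot(p, q):
    """Minkowski dot product eta_{mu nu} p^mu q^nu, as in (7)."""
    return sum(e * a * b for e, a, b in zip(ETA_DIAG, p, q))

def boost_x(p, u):
    """Boost a 4-vector along x with relative velocity u (|u| < 1, c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    p0, p1, p2, p3 = p
    return (gamma * (p0 + u * p1), gamma * (p1 + u * p0), p2, p3)

p = (3.0, 1.0, -2.0, 0.5)  # arbitrary illustrative components
q = (1.0, 4.0, 0.0, -1.0)
u = 0.6

before = dot(p, q)
after = dot(boost_x(p, u), boost_x(q, u))
print(before, after)  # the two numbers agree up to rounding
```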


Indices upstairs and downstairs 


Earlier, in our discussion of curved surfaces in chapter I.6, I snuck in lower indices by
writing the indices on $g_{\mu\nu}$ as subscripts. Here, I have done the same, writing $\eta_{\mu\nu}$ as an
object carrying lower indices.

Thus far, $\eta_{\mu\nu}$ is the only object with lower indices. When we want to sum over two indices
$\mu$ and $\nu$, the rule is that we multiply by $\eta_{\mu\nu}$ and invoke Einstein’s repeated summation
convention. We say that we have contracted the two indices. For example, given two
vectors $p^\mu$ and $q^\nu$, we might want to contract the indices $\mu$ and $\nu$ in $p^\mu q^\nu$ and obtain
$p \cdot q = \eta_{\mu\nu}p^\mu q^\nu$. Another example: given $p^\mu q^\nu r^\rho s^\sigma$, suppose we want to contract $\mu$ with
$\sigma$ and $\nu$ with $\rho$. Easy, just write $\eta_{\mu\sigma}\eta_{\nu\rho}p^\mu q^\nu r^\rho s^\sigma = (p \cdot s)(q \cdot r)$. Savvy readers will
recognize that I am going painfully slowly here for the sake of those who have never seen
this material before.

So far so good. All vectors carry upper indices, and the only object that carries lower
indices is $\eta$.

The next step is purely for the sake of notational brevity. To save ourselves from constantly
writing the Minkowski metric $\eta_{\mu\nu}$, we define, when we are given a vector $p^\mu$, a
vector with a lower index

$$p_\mu \equiv \eta_{\mu\nu}p^\nu \tag{8}$$

In other words, if $p^\mu = (p^0, \vec{p})$ then $p_\mu = (-p^0, \vec{p})$. Thus, $p \cdot q = p_\mu q^\mu = -p^0q^0 + \vec{p} \cdot \vec{q}$.
(Notice that the same dot in this last equation carries two different meanings: the scalar
product between two 4-vectors on the left hand side and between two 3-vectors on the
right hand side, but there should be no confusion.) With this notation, we can write
$p \cdot q = p_\mu q^\mu = p^\mu q_\mu$. Similarly, an expression $\eta_{\mu\nu}p^\mu q^\nu \eta_{\rho\sigma}r^\rho s^\sigma$ can be written more simply
as $p_\nu q^\nu r_\sigma s^\sigma$. The Minkowski metric has been folded into the indices, so to speak.


Just a convenient notation 


Unaccountably, some students are twisted out of shape by this trivial act of notational
sloth. “What?” they say, “There are two kinds of vectors?” Yes, fancy people speak of
contravariant vectors ($p^\mu$ for example) and covariant vectors ($p_\mu$ for example), but let
me assure the beginners that there is nothing terribly profound* going on here. Just a
convenient notation.

Let us immediately clear up some potential questions about the notation. Some students
have asked why there isn’t a distinction between upper and lower indices for ordinary
vectors. The answer is that we could have, if we wanted to, written the Euclidean metric $\delta_{ij}$
with lower indices back in chapter I.3 and risked confusing the reader at that early stage.
But there is no strong incentive for doing that: the Euclidean metric does not contain any
minus signs, while the Minkowskian metric necessarily has one negative sign and three
positive signs to distinguish time from space. The upper and lower index notation serves
to keep track of the minus signs. In the Euclidean case, if we define $p_i = \delta_{ij}p^j$, the vector
$p_i$ would be numerically the same as the vector $p^i$. In Minkowski space, $p_1 = p^1$, $p_2 = p^2$,
and $p_3 = p^3$, but $p_0 = -p^0$.

The next question might be: given $p_\mu$, how do we get back to $p^\mu$?

Here is where I think beginners can get a bit confused. If you have any math sense at all,
you would expect that we use the inverse of $\eta$, and you would be absolutely right. Surely,
if you use $\eta$ to move indices downstairs, you would use the inverse of $\eta$ to move them
upstairs. But the inverse of the matrix

$$\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & +1 & 0 & 0 \\ 0 & 0 & +1 & 0 \\ 0 & 0 & 0 & +1 \end{pmatrix}$$


is itself. So traditionally, the inverse of $\eta$ is denoted by the same symbol, but with two upper
indices, like this: $\eta^{\mu\nu}$. We define $\eta^{\mu\nu}$ by $\eta^{00} = -1$, $\eta^{11} = \eta^{22} = \eta^{33} = +1$, and $\eta^{\mu\nu} = 0$ if
$\mu \neq \nu$.

Indeed, $\eta^{\mu\nu}$ is the inverse of $\eta_{\mu\nu}$, regarded as a matrix: $\eta^{\mu\nu}\eta_{\nu\lambda} = \delta^\mu_\lambda$, where the Kronecker
delta $\delta^\mu_\lambda$ is defined, as usual, to be 1 if $\mu = \lambda$ and 0 otherwise. It is worth emphasizing
that while $\eta^{\mu\nu}$ and $\eta_{\mu\nu}$ are numerically the same matrix, they should be distinguished
conceptually. Let us check the obvious, that the inverse metric $\eta^{\mu\nu}$ raises lower indices:
$\eta^{\mu\nu}p_\nu = \eta^{\mu\nu}\eta_{\nu\lambda}p^\lambda = \delta^\mu_\lambda p^\lambda = p^\mu$. Yes, indeed.
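
For readers who like to see the bookkeeping spelled out, here is a minimal sketch of lowering and raising indices with the diagonal Minkowski metric. The function names `lower`, `raise_index`, and `contract` and the sample components are assumptions introduced for illustration only.

```python
# Diagonal entries shared by eta_{mu nu} and its inverse eta^{mu nu}.
ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)

def lower(p_up):
    """p_mu = eta_{mu nu} p^nu, as in (8): only the time component changes sign."""
    return tuple(e * c for e, c in zip(ETA_DIAG, p_up))

def raise_index(p_down):
    """p^mu = eta^{mu nu} p_nu: numerically the same operation as lowering."""
    return tuple(e * c for e, c in zip(ETA_DIAG, p_down))

def contract(a_down, b_up):
    """Sum a lower index against an upper index: a_mu b^mu."""
    return sum(a * b for a, b in zip(a_down, b_up))

p_up = (2.0, 1.0, 0.0, -3.0)  # arbitrary illustrative components
q_up = (1.0, 0.5, 4.0, 1.0)

print(lower(p_up))                  # time component flipped
print(raise_index(lower(p_up)))     # back to the original vector
print(contract(lower(p_up), q_up))  # p_mu q^mu
print(contract(lower(q_up), p_up))  # p^mu q_mu, the same number
```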

Confusio: “Ah, I get it. The same symbol $\eta$ is used to denote a matrix and its inverse,
distinguished by whether $\eta$ carries lower or upper indices.”

From this we see that the Kronecker delta $\delta^\mu_\lambda$ has to be written with one upper and one
lower index. In contrast, $\eta^\mu{}_\lambda$ does not exist. And there is no such thing as $\delta^{\mu\nu}$ or $\delta_{\mu\nu}$. Also
note that the Kronecker delta $\delta$ does not contain any minus signs, unlike the Minkowski
metric $\eta$.

It follows that the shorthand $\partial_\mu$ for $\frac{\partial}{\partial x^\mu}$ has to carry a lower index, because
$\partial_\mu x^\nu = \frac{\partial x^\nu}{\partial x^\mu} = \delta^\nu_\mu$. In other words, for the indices to match, $\partial_\mu$ must be written with a
lower index. This makes sense, since the coordinates $x^\mu$ carry an upper index but in
$\frac{\partial}{\partial x^\mu}$ it appears in the denominator, so to speak. We will use this fact repeatedly. Once
again, we already know this from chapters I.5 and I.6.


* Of course, if you woke up one day and discovered that you were a mathematician or a mathematician-want-
to-be, you should and could read more profound books.

Let me emphasize again an extremely useful feature of this notational device. In the 
Einstein convention, a lower index is always contracted with an upper index (that is, 
summed over), and vice versa. Never never sum two lower indices together, or two upper 
indices together! If you ever encounter an expression in which two lower (or two upper) 
indices are summed over, you know that there is a mistake* somewhere. 

At the risk of repeating the obvious, remember, there is nothing profound† going on
here. The whole business of introducing upper and lower indices is just for notational 
convenience. You are to read (8) as merely a definition introduced to save writing. 


Spacelike and null surfaces 


Now that space is married to time, what do we mean by the term “space”? Evidently, a 
t = constant slice of Minkowski spacetime can be regarded as space. But the time and 
space axes of another observer would in general be tilted with respect to yours, so a tilted 
slice should also count. Such considerations lead us to generalize the concept of space 
to spacelike surfaces to be defined below. Well, we learned how to characterize surfaces 
embedded in Euclidean space in chapters I.6 and I.7, defining tangent vectors lying in the 
surface and normal vectors (usually only one of them) perpendicular to the surface. Here 
we do the same for surfaces embedded in Minkowski spacetime. 

So, a surface in Minkowski spacetime consists of the set of points satisfying some 
(reasonable) equation of the form $F(x^0, x^1, x^2, x^3) = 0$ (a special case of which is $x^0 =
f(x^1, x^2, x^3)$). Here the word “surface” is used in a generalized sense, not necessarily a
2-dimensional surface of the kind we encounter in everyday life. The Jargon Guy yells, “Call it
a hypersurface!” but we ignore him. A surface is called spacelike if the separation between 
any two infinitesimally nearby points on the surface is spacelike. The three tangent vectors 
are everywhere spacelike. The normal to the surface is then a timelike vector. Notice that 
this definition allows for the possibility that the surface is curved. Thus, when we get to 
curved spacetime in part V, the same definition is still serviceable and provides what we 
mean by “space.” 


* In practice, this evident truth is used as follows: people can afford to be sloppy in intermediate stages of a 
calculation, but then they move indices up and down at the end to satisfy this rule. 

† Once, when I taught special relativity, I surveyed the students to find out what, if anything, they found
confusing or deficient in the textbook I used. One student, a kind of super-Confusio, told me that the textbook
never explicitly said that $p_\mu q^\mu$ and $p_\nu q^\nu$ are actually the same and it took him, poor fellow, a long time to figure
it out. Let it be recorded that this textbook explicitly states that $p_0q^0 + p_1q^1 + p_2q^2 + p_3q^3 = \sum_\mu p_\mu q^\mu = p_\mu q^\mu =
\eta_{\mu\nu}p^\mu q^\nu = p^\nu q_\nu = p^\rho\eta_{\rho\sigma}q^\sigma = p^\rho q_\rho = p_\sigma q^\sigma = p_\pi q^\pi$ (with the last choice, although perfectly okay, not
generally advisable, given the cultural baggage® associated with $\pi$). Got that? The so-called dummy summation
variables are just dummies to keep track of things.
Surfaces generated by light rays, not surprisingly, form another important class of
surfaces known as null surfaces. Write Minkowski spacetime in spherical coordinates,
with $ds^2 = -dt^2 + dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\varphi^2$. In these coordinates, the metric is given by
$g_{tt} = -1$, $g_{rr} = 1$, $g_{\theta\theta} = r^2$, $g_{\varphi\varphi} = r^2\sin^2\theta$, with all other components equal to 0 (in other
words, exactly the same as in chapter I.5 except for the addition of the time coordinate). Now
consider the light cone defined by the equation $t = r$ in these coordinates $(t, r, \theta, \varphi)$ and
formed physically by light rays going radially outward from the origin, along the trajectory
$(dt, dr, d\theta, d\varphi) \propto l^\mu = (1, 1, 0, 0)$. This tangent vector along the light cone is clearly null,
since $g_{\mu\nu}l^\mu l^\nu = g_{tt} + g_{rr} = -1 + 1 = 0$.

The other two tangent vectors are given by

$$h^\mu = (0, 0, r^{-1}, 0) \quad \text{and} \quad k^\mu = (0, 0, 0, (r\sin\theta)^{-1})$$

(We have normalized them to $g_{\mu\nu}h^\mu h^\nu = 1$ and $g_{\mu\nu}k^\mu k^\nu = 1$.) Evidently, they are spacelike
and orthogonal to $l^\mu$, since $g_{\mu\nu}l^\mu h^\nu = 0$ and $g_{\mu\nu}l^\mu k^\nu = 0$. They are also mutually orthogonal,
since $g_{\mu\nu}h^\mu k^\nu = 0$. These three 4-vectors $l$, $h$, $k$ furnish the three tangent vectors to
the surface.

Now comes the fun part. What is a 4-vector normal to the null surface? The answer is
evidently $l$ itself! Since $g_{\mu\nu}l^\mu l^\nu = 0$, $l$ is “Minkowski perpendicular” to itself, and furthermore,
to $h$ and $k$. In other words, $l$ is Minkowski perpendicular to all three tangent vectors
lying in the null surface.
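
The orthogonality relations just quoted are easy to confirm numerically. The following sketch assumes the spherical-coordinate metric written above; the helper names `metric_diag` and `inner` and the sample point (r = 2, θ = 0.7) are illustrative choices, not part of the text.

```python
import math

def metric_diag(r, theta):
    """Diagonal of g_{mu nu} for ds^2 = -dt^2 + dr^2 + r^2 dtheta^2 + r^2 sin^2(theta) dphi^2."""
    return (-1.0, 1.0, r * r, (r * math.sin(theta)) ** 2)

def inner(g, a, b):
    """g_{mu nu} a^mu b^nu for a diagonal metric."""
    return sum(gi * ai * bi for gi, ai, bi in zip(g, a, b))

r, theta = 2.0, 0.7  # an arbitrary point on the light cone t = r
g = metric_diag(r, theta)

l = (1.0, 1.0, 0.0, 0.0)                          # null tangent along the light ray
h = (0.0, 0.0, 1.0 / r, 0.0)                      # unit tangent in the theta direction
k = (0.0, 0.0, 0.0, 1.0 / (r * math.sin(theta)))  # unit tangent in the phi direction

print(inner(g, l, l))                  # l is null
print(inner(g, h, h), inner(g, k, k))  # h and k are normalized to 1
print(inner(g, l, h), inner(g, l, k), inner(g, h, k))  # all mutually orthogonal
```

Since l has vanishing inner product with l, h, and k, it is orthogonal to every tangent vector of the surface, which is the sense in which it serves as its own normal.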

We could utter the following peculiar-sounding statement: the normal to a null surface 
is a null vector that lies in the surface. Minus signs could do “funny tricks” for us that 
Euclid never dreamed of! 

The null surface of the light cone has another peculiar property: it’s a “one way” surface. 
Think of a massive particle moving along a timelike worldline. Once it enters into a given 
light cone, it can’t get out again. If we think of the null surface as a membrane, it is 
permeable only in one direction just like a certain hotel (in California): you can check 
in, but you can never leave. 


The relativistic Doppler shift 


As a simple application of the 4-vector formalism and as a break from a rather formal
discussion, let’s derive the relativistic Doppler effect. Consider an electromagnetic
wave observed by Ms. Unprime and described schematically by $\cos(\omega t - \vec{k} \cdot \vec{x}) = \cos kx =
\cos \eta_{\mu\nu}k^\mu x^\nu$, where we have defined the 4-vector $k^\mu = (\omega, \vec{k})$ with the circular frequency $\omega$
and the wave vector $\vec{k}$. As usual, $\omega^2 = \vec{k}^2$. The physical requirement $k'x' = kx$ is satisfied
if $k$ transforms like $x$, that is, like a 4-vector.

What are the frequency and wave vector observed by Mr. Prime? The answer, namely the
relativistic Doppler formula, follows almost instantly from the Lorentz transformation:
$\omega' = (\omega + uk_x)/\sqrt{1 - u^2}$, $k'_x = (k_x + u\omega)/\sqrt{1 - u^2}$, $k'_y = k_y$, $k'_z = k_z$. We obtain thus

$$\omega' = \omega(1 + u\cos\theta)/\sqrt{1 - u^2} \tag{9}$$
where $\theta$ is the angle between $\vec{k}$ and $\vec{u}$. As the train is approaching Mr. Prime, $\theta \simeq 0$,
$\omega' \simeq \omega\sqrt{\frac{1+u}{1-u}}$, and so the frequency observed by Mr. Prime increases; in other words, it
is blue shifted. As the train recedes, $\theta \simeq \pi$, $\omega' \simeq \omega\sqrt{\frac{1-u}{1+u}}$, and the frequency is redshifted.

Compare this derivation with the elementary nonrelativistic discussion involving, if the
source is receding along the line of sight, the extra distance the next crest of the wave has
to travel and so on and so forth. Note that we have an extra relativistic factor of $1/\sqrt{1 - u^2}$,
which we will identify in the next chapter as due to time dilation.
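
To make (9) concrete, the sketch below evaluates the Doppler formula in the two limiting cases and compares it with a direct boost of the wave 4-vector. The function names `doppler` and `boost_k` and the values ω = 1, u = 0.6, θ = 1.1 are assumptions chosen for illustration, with c = 1 throughout.

```python
import math

def doppler(omega, u, theta):
    """Formula (9): theta is the angle between the wave vector and the velocity."""
    return omega * (1.0 + u * math.cos(theta)) / math.sqrt(1.0 - u * u)

def boost_k(k, u):
    """Lorentz-transform the wave 4-vector (omega, kx, ky, kz) along x."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    w, kx, ky, kz = k
    return (gamma * (w + u * kx), gamma * (kx + u * w), ky, kz)

omega, u = 1.0, 0.6

approaching = doppler(omega, u, 0.0)    # equals omega * sqrt((1 + u) / (1 - u))
receding = doppler(omega, u, math.pi)   # equals omega * sqrt((1 - u) / (1 + u))
print(approaching, receding)

# Cross-check at a generic angle: boost k = (omega, omega cos(theta), omega sin(theta), 0).
theta = 1.1
k = (omega, omega * math.cos(theta), omega * math.sin(theta), 0.0)
k_prime = boost_k(k, u)
print(k_prime[0], doppler(omega, u, theta))

# The boosted wave vector is still null: omega'^2 equals the squared 3-vector k'.
null_check = k_prime[0] ** 2 - (k_prime[1] ** 2 + k_prime[2] ** 2 + k_prime[3] ** 2)
print(null_check)
```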


One unified language 


We derived the Lorentz transformation in the preceding chapter, but it is instructive to 
derive it again. Let p and q be two arbitrary 4-vectors. Consider the linear transformation 


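$$p^\mu \to p'^\mu = \Lambda^\mu{}_\rho\,p^\rho \quad \text{and} \quad q^\nu \to q'^\nu = \Lambda^\nu{}_\sigma\,q^\sigma \tag{10}$$

Notice the upper-lower summation convention. For $\Lambda$ to be a Lorentz transformation, we
require $p' \cdot q' = p \cdot q$, that is,

$$p' \cdot q' = \eta_{\mu\nu}p'^\mu q'^\nu = \eta_{\mu\nu}\Lambda^\mu{}_\rho\,p^\rho\,\Lambda^\nu{}_\sigma\,q^\sigma = p \cdot q = \eta_{\rho\sigma}p^\rho q^\sigma \tag{11}$$

Since $p^\rho$ and $q^\sigma$ are arbitrary, $\Lambda$ must satisfy

$$\eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = \eta_{\rho\sigma} \tag{12}$$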


Just as we determined rotations as transformations that left $\vec{p} \cdot \vec{q}$ invariant (in chapter
I.3), here we determine Lorentz transformations as those transformations that leave $p \cdot q$
invariant. You also may recognize that $\Lambda^\mu{}_\rho$ is playing the same role as $S^\mu{}_\rho$ in chapter I.5.
Indeed, in parallel with the discussion there, let us define the transpose by $(\Lambda^T)_\rho{}^\mu = \Lambda^\mu{}_\rho$
(note the position of the indices!), so that we may write (12) as $(\Lambda^T)_\rho{}^\mu\,\eta_{\mu\nu}\Lambda^\nu{}_\sigma = \eta_{\rho\sigma}$, or
more succinctly, $\Lambda^T\eta\Lambda = \eta$.
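
The matrix form of condition (12) lends itself to a direct numerical test. In this sketch the helper names (`matmul`, `transpose`, `boost_x`, `rotation_z`, `is_lorentz`) and the sample parameters are assumptions for illustration; the boost matrix is the standard one with c = 1.

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4-by-4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def boost_x(u):
    """Boost along x with velocity u (|u| < 1)."""
    g = 1.0 / math.sqrt(1.0 - u * u)
    return [[g, g * u, 0, 0], [g * u, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_z(angle):
    """Rotation in the x-y plane; the time row and column are untouched."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def is_lorentz(lam, tol=1e-12):
    """Check the condition Lambda^T eta Lambda = eta entry by entry."""
    lhs = matmul(matmul(transpose(lam), ETA), lam)
    return all(abs(lhs[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

shear = [[1, 0.1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # linear, but not Lorentz

print(is_lorentz(boost_x(0.6)))
print(is_lorentz(rotation_z(0.3)))
print(is_lorentz(matmul(boost_x(0.6), rotation_z(0.3))))
print(is_lorentz(shear))
```

Boosts, rotations, and their products pass the test, while a generic linear map such as the shear does not.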

I find it rather pleasing to have one unified language to describe four apparently different 
subjects: rotation, change of coordinates, flat space, and Lorentz transformation. As I 
have alluded to and as you will soon see, the same language is used in studying curved 
spacetimes. 

For the sake of the super-Confusio mentioned in the preceding footnote and first alluded
to in chapter I.6, let me stress once again that repeated indices are summed over and so
can be represented by any letter we wish, as long as it’s a letter in whatever alphabet you
are using that we haven't yet used in the same expression. For instance, we could write
(11) just as well as $p' \cdot q' = \eta_{\rho\sigma}p'^\rho q'^\sigma = \eta_{\rho\sigma}\Lambda^\rho{}_\mu\,p^\mu\,\Lambda^\sigma{}_\nu\,q^\nu = p \cdot q = \eta_{\mu\nu}p^\mu q^\nu$. Since there
are only so many letters in the alphabets commonly used in physics, you will often see 
the same expression written (as in the example here) with completely different letters, 
particularly when we get to general relativity. 
Lorentz algebra


Following Sophus Lie once again, we consider an infinitesimal transformation $\Lambda^\mu{}_\nu \simeq
\delta^\mu_\nu + \varphi K^\mu{}_\nu$ (with the Kronecker delta defined earlier), just as in chapter I.3 we considered
an infinitesimal rotation $R \simeq I + \theta\mathcal{J}$. As indicated above for $\Lambda^T$, we should also define
$(K^T)_\rho{}^\mu = K^\mu{}_\rho$. Inserting this infinitesimal transformation into (12), we obtain, to leading
order in $\varphi$,

$$K^\mu{}_\rho\,\eta_{\mu\sigma} + \eta_{\rho\nu}K^\nu{}_\sigma = 0 \tag{13}$$

which we can write as

$$K^T\eta = -\eta K \tag{14}$$

We are to solve (14) for the unknown 4-by-4 matrix $K$, but actually this problem involves
only 2-by-2 matrices. Consider a boost in the $x$ direction. Since $y' = y$ and $z' = z$, there is
no point in dragging them around, and we can focus on the 2-dimensional space spanned
by $t$ and $x$, so that effectively $\eta = \begin{pmatrix} -1 & 0 \\ 0 & +1 \end{pmatrix}$ and we are to solve (14) for the effectively
2-by-2 matrix $K$. The solution is

$$K = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{15}$$

(If you have taken a course on quantum mechanics and know what Pauli matrices $\sigma_i$ are,
you can see that $K$ and $-\eta$ are just $\sigma_1$ and $\sigma_3$, the first and the third Pauli matrix, respectively.
Finding the solution is a snap. Noting that $\sigma_1$ is symmetric and that it anticommutes with
$\sigma_3$ gives us (15) immediately.)

You should appreciate, exactly as in chapter I.3 for rotations, how easy it is to solve the
Lorentz condition for infinitesimal boosts. To leading order in $\varphi$, with $t' \simeq t + \varphi x$ and
$x' \simeq x + \varphi t$, we have $t'^2 - x'^2 \simeq t^2 - x^2 + 2\varphi(tx - xt) = t^2 - x^2$.

Note a crucial difference between infinitesimal boosts and rotations: $K$ is symmetric,
while in contrast, $\mathcal{J}$ is antisymmetric (see (I.3.7)). Indeed, if we replace $\eta \to I$, $K \to \mathcal{J}$ in
(14), we obtain $\mathcal{J}^T = -\mathcal{J}$.

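As Lie had assured us, just as in our discussion of rotation, once we have determined the
infinitesimal boost, we can generate finite boosts by repeatedly boosting by an infinitesimal
amount. For finite $\varphi$ and $N$ large, write $\Lambda(\varphi/N) \simeq I + \frac{\varphi}{N}K \simeq e^{\frac{\varphi}{N}K}$. For a finite boost, then
$\Lambda(\varphi) = (\Lambda(\varphi/N))^N = e^{\varphi K}$. Expanding the series, we obtain

$$\Lambda(\varphi) = e^{\varphi K} = \sum_{n=0}^{\infty} \varphi^n K^n/n! = \left(\sum_{k=0}^{\infty} \varphi^{2k}/(2k)!\right) I + \left(\sum_{k=0}^{\infty} \varphi^{2k+1}/(2k+1)!\right) K = \cosh\varphi\,I + \sinh\varphi\,K = \begin{pmatrix} \cosh\varphi & \sinh\varphi \\ \sinh\varphi & \cosh\varphi \end{pmatrix} \tag{16}$$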
Note that we used $K^2 = I$ in crucial contrast to $\mathcal{J}^2 = -I$ for rotation. Thus, we obtain

$$\begin{aligned} t' &= \cosh\varphi\,t + \sinh\varphi\,x \\ x' &= \sinh\varphi\,t + \cosh\varphi\,x \\ y' &= y \\ z' &= z \end{aligned} \tag{17}$$

with $y$ and $z$ brought back in.

And thus, once again we have derived the Lorentz transformation. I hope that you 
appreciate how elegant Lie’s method is compared to the brute force method we used in 
the preceding chapter. 

For two successive boosts in the $x$ direction, $\Lambda(\varphi_1)\Lambda(\varphi_2) = \Lambda(\varphi_1 + \varphi_2)$. The parameter
$\varphi$, sometimes called rapidity, combines additively. This also implies that $\Lambda^{-1}(\varphi) = \Lambda(-\varphi)$.
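
Lie's construction can be tried out numerically: summing the power series for $e^{\varphi K}$ with the 2-by-2 matrix K of (15) should reproduce the hyperbolic functions in (16), and rapidities should add. The following is a sketch under those assumptions; the function names `exp_series` and `boost`, the number of terms, and the sample rapidities are illustrative choices.

```python
import math

K = [[0.0, 1.0], [1.0, 0.0]]   # the generator in (15)
I2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul2(a, b):
    """Product of two 2-by-2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def exp_series(phi, terms=40):
    """Sum phi^n K^n / n! for n = 0, ..., terms - 1."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    power = I2    # K^n, starting from K^0
    coeff = 1.0   # phi^n / n!
    for n in range(terms):
        for i in range(2):
            for j in range(2):
                result[i][j] += coeff * power[i][j]
        power = matmul2(power, K)
        coeff *= phi / (n + 1)
    return result

def boost(phi):
    """Closed form (16)."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s], [s, c]]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

print(close(exp_series(0.8), boost(0.8)))                   # series matches cosh and sinh
print(close(matmul2(boost(0.3), boost(0.5)), boost(0.8)))   # rapidities add
print(close(matmul2(boost(0.8), boost(-0.8)), I2))          # inverse is the boost by -phi
```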

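Also, (13) can be rearranged as $\eta_{\mu\nu}K^\nu{}_\sigma\,\eta^{\sigma\rho} = -K^\rho{}_\mu$, which implies that $\eta e^{\varphi K}\eta = e^{-\varphi K^T}$
and hence (since $\Lambda = e^{\varphi K}$) $\eta_{\mu\nu}\Lambda^\nu{}_\sigma\,\eta^{\sigma\rho} = (\Lambda^{-1})^\rho{}_\mu$. We could of course verify this explicitly
using (16):

$$\begin{pmatrix} -1 & 0 \\ 0 & +1 \end{pmatrix} \begin{pmatrix} \cosh\varphi & \sinh\varphi \\ \sinh\varphi & \cosh\varphi \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & +1 \end{pmatrix} = \begin{pmatrix} \cosh\varphi & -\sinh\varphi \\ -\sinh\varphi & \cosh\varphi \end{pmatrix} \tag{18}$$

We note for future use that since $\cosh^2\varphi - \sinh^2\varphi = 1$, the Jacobian $\det\Lambda$ for Lorentz
transformations, just as for rotations, is equal to 1. Thus, $d^4x' = d^4x$: spacetime volume
is unchanged.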
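It is instructive at this point to work out how $p_\mu$ transforms:

$$p'_\mu = \eta_{\mu\nu}p'^\nu = \eta_{\mu\nu}\Lambda^\nu{}_\sigma\,p^\sigma = \eta_{\mu\nu}\Lambda^\nu{}_\sigma\,\eta^{\sigma\rho}p_\rho = (\Lambda^{-1})^\rho{}_\mu\,p_\rho \tag{19}$$

Thus, while a vector with an upper index transforms with $\Lambda$, a vector with a lower index
transforms with $\Lambda^{-1}$. They transform oppositely.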
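
Equation (19) can be checked in the 2-dimensional (t, x) space by computing the primed lower-index vector in two ways. This is a minimal sketch; the names `route_a` and `route_b`, the rapidity 0.9, and the sample components are assumptions introduced for the example.

```python
import math

def boost(phi):
    """The 2-by-2 boost (16) acting on (t, x) components."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s], [s, c]]

def apply(m, v):
    """Matrix times column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def lower(v):
    """Lower the index with eta = diag(-1, +1)."""
    return [-v[0], v[1]]

phi = 0.9
p_up = [2.0, 0.5]   # arbitrary illustrative components (p^0, p^1)
q_up = [1.0, -3.0]

# Route A: transform the upper-index vector with Lambda, then lower the index.
route_a = lower(apply(boost(phi), p_up))

# Route B: lower first, then contract with the first index of Lambda^{-1} = Lambda(-phi),
# which amounts to acting with the transpose of Lambda^{-1}.
route_b = apply(transpose(boost(-phi)), lower(p_up))
print(route_a, route_b)

# Because the two kinds of index transform oppositely, p_mu q^mu is unchanged.
q_prime = apply(boost(phi), q_up)
scalar_before = sum(a * b for a, b in zip(lower(p_up), q_up))
scalar_after = sum(a * b for a, b in zip(route_a, q_prime))
print(scalar_before, scalar_after)
```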

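Note carefully the locations of various indices in the above discussion. Note also from
(16) that $\Lambda$ and $\Lambda^{-1}$ are symmetric as matrices. As an exercise, we could verify explicitly that
$\partial_\mu \equiv \frac{\partial}{\partial x^\mu}$ transforms like a vector with a lower index, as we argued earlier. Since $x'^\mu = \Lambda^\mu{}_\nu x^\nu$,
we have $x^\nu = (\Lambda^{-1})^\nu{}_\mu x'^\mu$ and hence $\partial'_\mu \equiv \frac{\partial}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\frac{\partial}{\partial x^\nu} = (\Lambda^{-1})^\nu{}_\mu\,\partial_\nu$.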


Lorentz tensors 


Henceforth, physics is required to be invariant not only under rotations but also under 
boosts. (Clearly, in addition to the boost along the x direction displayed in (17), we can 
also boost along the y and z directions.) The set of all rotations and boosts is known as the 
Lorentz group, which we will discuss in more detail in appendix 1. The rotation group is 
evidently a subgroup of the Lorentz group. 

The concept of tensors as discussed in chapter I.4 can be immediately generalized.
Mimicking the discussion for the rotation group, we immediately define a tensor with 2
upper indices $T^{\mu\nu}$ to be something that transforms as $T^{\mu\nu} \to T'^{\mu\nu} = \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma T^{\rho\sigma}$. The
earlier discussion goes through; in particular, the symmetric and antisymmetric parts
($S^{\mu\nu} = T^{\mu\nu} + T^{\nu\mu}$ and $A^{\mu\nu} = T^{\mu\nu} - T^{\nu\mu}$) of $T^{\mu\nu}$ transform separately, furnishing a
$4 \cdot 5/2 = 10$-dimensional and a $4 \cdot 3/2 = 6$-dimensional representation, respectively. Just as
vectors can carry lower as well as upper indices, so can tensors. We may lower and raise
indices at will, using $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$, respectively. For example, $T_\mu{}^\nu = \eta_{\mu\rho}T^{\rho\nu}$.
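
The statement that the symmetric and antisymmetric parts transform separately, and the counting 10 + 6 = 16, can be illustrated with a short script. The helper names, the test tensor, and the boost velocity below are assumptions made for this sketch; the transformation law is written in matrix form as T' = Λ T Λ^T.

```python
import math

N = 4  # spacetime dimension

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def boost_x(u):
    g = 1.0 / math.sqrt(1.0 - u * u)
    return [[g, g * u, 0, 0], [g * u, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transform(lam, t):
    """T'^{mu nu} = Lambda^mu_rho Lambda^nu_sigma T^{rho sigma}."""
    return matmul(matmul(lam, t), transpose(lam))

def sym(t):
    return [[t[i][j] + t[j][i] for j in range(N)] for i in range(N)]

def antisym(t):
    return [[t[i][j] - t[j][i] for j in range(N)] for i in range(N)]

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(N) for j in range(N))

T = [[float(4 * i + j + 1) for j in range(N)] for i in range(N)]  # arbitrary test tensor
lam = boost_x(0.6)

# Taking the (anti)symmetric part commutes with the Lorentz transformation.
print(close(sym(transform(lam, T)), transform(lam, sym(T))))
print(close(antisym(transform(lam, T)), transform(lam, antisym(T))))

# Independent components: pairs with i <= j, and pairs with i < j.
sym_count = sum(1 for i in range(N) for j in range(i, N))
antisym_count = sum(1 for i in range(N) for j in range(i + 1, N))
print(sym_count, antisym_count)
```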


Euclid did not forbid you to study curves 


It is worthwhile to underline a deep-seated but common misunderstanding that exists 
among some students of special relativity. The subject is concerned with the physics seen 
by two observers in uniform motion relative to each other. Absolutely nothing says that 
the objects* they are studying have to move at constant speed. The confusion appears to 
stem from thinking that special relativity is only capable of dealing with objects that do 
not accelerate and that you need general relativity.†

Put differently, special relativity teaches us how $dx^\mu$ and $dx'^\mu$ are related, but nothing
in the Lorentz transformation requires that $\frac{d^2x'^\mu}{d\tau^2}$ and $\frac{d^2x^\mu}{d\tau^2}$ vanish.

There are actually misguided people walking around talking about the twin “paradox.”
A guy takes off in a rocketship while his twin stays home. When he comes back, he finds
that the stay-at-home twin has aged a lot more.° 

So? The twin “paradox” is resolved by pointing out that it is not a paradox at all.” In 
ordinary space, nobody claims that the lengths of different paths connecting two points 
are necessarily the same. A guy drove from Los Angeles to San Francisco via Las Vegas. His 
twin drove directly from Los Angeles to San Francisco. When they met, the guy who went 
to Las Vegas found that he had burned up more gas than his twin did. That is no more a 
paradox than the twin paradox is a paradox. The lengths of different paths connecting two 
points in spacetime can of course be different. Indeed, it would be quite amazing if the 
lengths of entirely different paths connecting two points in spacetime turn out to be the 
same. If the twins were the same age when they met, now that would be quite a paradox 
indeed! 

At big accelerators, unstable particles zip around the ring at speeds close to c. A particle 
of the same type sitting in the lab has long decayed, while its “twin” is still speeding around 
the ring. That the twin paradox is not a paradox is a solid experimental fact that has been 
verified countless times. 

In the twin paradox, two observers (Ms. Unprime and Mr. Prime) in relative uniform 
motion observe the two twins. Notice that the stay-at-home twin is not required to sit still. 
What special relativity requires is that the two observers agree on the proper time that has 
elapsed for the stay-at-home twin when his wandering sibling returns. The two observers 


* The theory of special relativity does not care whether these objects are animate or not: they could be charmed 
mesons or other observers. 

† This is manifestly untrue: accelerators accelerate particles, but to master particle physics, students do not
necessarily have to become proficient in general relativity. 
also must agree on the proper time that has elapsed for the wandering twin. What is not
required at all is that these two proper times agree. 
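
A small computation makes the point concrete. The sketch below idealizes the wandering twin's trip as two straight segments with an instantaneous turnaround (an assumption made purely for simplicity), works in one spatial dimension with c = 1, and uses illustrative values T = 10, v = 0.8, and observer velocity u = 0.5. The function names are hypothetical.

```python
import math

def interval(e1, e2):
    """Proper time along a straight timelike segment between events (t, x)."""
    dt, dx = e2[0] - e1[0], e2[1] - e1[1]
    return math.sqrt(dt * dt - dx * dx)

def path_proper_time(events):
    """Add up the proper time over consecutive straight segments."""
    return sum(interval(a, b) for a, b in zip(events, events[1:]))

def boost(event, u):
    """Coordinates of the same event as assigned by an observer moving with velocity u."""
    g = 1.0 / math.sqrt(1.0 - u * u)
    t, x = event
    return (g * (t + u * x), g * (x + u * t))

T, v = 10.0, 0.8
home = [(0.0, 0.0), (T, 0.0)]
traveler = [(0.0, 0.0), (T / 2, v * T / 2), (T, 0.0)]

# Ms. Unprime's bookkeeping.
print(path_proper_time(home), path_proper_time(traveler))

# Mr. Prime's bookkeeping: different coordinates, same two proper times.
u = 0.5
home_prime = [boost(e, u) for e in home]
traveler_prime = [boost(e, u) for e in traveler]
print(path_proper_time(home_prime), path_proper_time(traveler_prime))
```

Both observers find the same proper time for the stay-at-home twin and the same (smaller) proper time for the wandering twin; the two proper times themselves differ, as they should.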

Perhaps I should say something even more obvious. Euclidean geometry is left invariant 
by rotations, but this does not mean we are allowed to study only straight lines in Euclidean 
space. We are of course also free to study curves. What Euclid requires is that two observers 
whose coordinates are related by a rotation must agree on the length of a given curve, but 
it would be absurd to insist that they proclaim all curves to have the same length. 

So, let me say it again: just as Euclid did not forbid you to study curves in his space, 
Einstein did not forbid you to study curves in his spacetime. Indeed, the equation of motion 
of a free particle (4) may be immediately generalized to that of a particle acted upon by an 
external force: 


$$m\frac{d^2X^\mu}{d\tau^2} = F^\mu \tag{20}$$


This is just Newton’s law $m\vec{a} = \vec{F}$ promoted to Einstein’s world. (We will talk a lot more
about promotion in chapter III.6.) The requirement that Ms. Unprime and Mr. Prime
subscribe to the same equation means that the force has to be promoted from a 3-vector
$\vec{F}$ to a 4-vector $F^\mu$, so that Mr. Prime would see the force $F'^\mu = \Lambda^\mu{}_\nu F^\nu$. (We will see an
explicit example of this when we discuss electromagnetism in chapter IV.1.)

To underline the fact that special relativity can be applied to observers undergoing 
arbitrary accelerations, I will let you prove a basic fact about acceleration in Minkowski 
spacetime, calculate how a constantly accelerating particle would move, and develop the 
concept of Fermi-Walker transport in the exercises. 


Lorentz, Poincaré, and Einstein: “to not trouble . . . old habits” 


The intellectual history of special relativity is exceptionally fascinating because, in contrast 
to general relativity, which was born largely through the labor of a single man, so many 
great minds participated in developing special relativity, with several coming to within 
a whisker of the final theory. Henri Poincaré in particular developed the Lorentz trans- 
formation into the form now known to us, but without taking that final leap of forcing 
the mathematics on physics.* Many felt that⁸ perhaps Einstein got too much credit and
Poincaré too little. The French physicist Thibault Damour had examined this point in depth
and concluded that the cartoon history, giving Einstein most of the credit, is largely correct.⁹
I think that Lorentz and Poincaré quite simply did not enjoy the boldness of youth:
in 1905, both were in their early 50s, while¹⁰ Einstein was 26. Indeed, months before his
death in 1912, Poincaré wrote: “Today some physicists want to adopt a new convention. It 
is not that they are forced to; they judge this new convention to be more useful, that is all; 
and those who are not of the same opinion may legitimately keep the former convention 


* Some theoretical physicists think that the pendulum has perhaps swung to the other extreme: these days, 
the leap may be leapt before doing anything else. 
in order to not trouble their old habits. I think, between us, that this is what they shall do
for a long time.”¹¹


I trust that you, the astute reader, after getting to this point, could explain to Poincaré that 
the work of Einstein and Minkowski (whom he did not refer to by name) was not merely a 
convention one could choose not to adopt. The telling phrase here, and a warning to you, 
is “to not trouble their old habits.” So reader, whatever your age may be, remember “the 
boldness of youth.” 

I close with a “pregnant” quote from Minkowski: “The essence of this postulate may
be clothed mathematically in a very pregnant manner in the mystic formula $3 \cdot 10^5$ km $=
\sqrt{-1}$ secs.”¹² I like his choice of words.


Appendix 1: Generalized “rotation” groups 


Looking at the boost in the $x$ direction (17), you might have been struck by its uncanny resemblance to a rotation
in the ($t$-$x$) plane, except that cosine and sine have been replaced by their hyperbolic counterparts and that a
minus sign has disappeared. Surely, this is not an accident. Indeed, recall from chapter I.3 that the rotation
group $SO(D)$ is defined as the set of all linear transformations $d\vec{x}' = R\,d\vec{x}$ on a collection of $D$ real variables
$(dx^1, dx^2, \cdots, dx^D)$, such that the quadratic form $ds^2 = \sum_{i=1}^{D} (dx^i)^2$ is left unchanged and $\det R = 1$ (this
specifies the S in $SO(D)$).

Our experience with Minkowski geometry warmly invites us to generalize. Define the group SO(m, n) as the
set of all linear transformations $dx' = R\,dx$ on a collection of $D = m + n$ real variables $(dx^1, dx^2, \cdots, dx^{m+n})$, such that
the quadratic form $ds^2 = \sum_{i=1}^{m}(dx^i)^2 - \sum_{i=m+1}^{m+n}(dx^i)^2 = \eta_{\mu\nu}dx^\mu dx^\nu$ is left unchanged and $\det R = 1$. (Here $\eta_{\mu\nu}$
denotes a generalized Minkowski metric, namely a diagonal (m + n)-by-(m + n) matrix with m (+1)s and n (−1)s
along the diagonal. The indices $\mu$ and $\nu$ range over 1, 2, ..., m + n.) These transformations form a group, since
if $R_1$ and $R_2$ leave the quadratic form invariant, then $R_1R_2$ also leaves the quadratic form invariant. The other
defining requirements of a group are even more obviously satisfied. The two integers (m, n) are known as the
signature of the group. (Incidentally, as mentioned in the text, quantities such as $ds^2$ and $d\tau^2$ are not necessarily
positive.)

Clearly, the groups SO(m, n) and SO(n, m) are the same: if $ds^2$ is left invariant, then so is $-ds^2$. The Lorentz
group is then simply SO(3, 1). The rotation groups and the Lorentz group can thus be studied in a unified fashion
as special cases of SO(m, n).
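The defining condition can be checked numerically in the simplest nontrivial case. What follows is only a minimal sketch, assuming two variables (t, x) with one (−1) and one (+1) on the diagonal and the hyperbolic form of the boost mentioned above; the helper names (`boost`, `preserves_eta`, `det2`) are mine, introduced purely for illustration.

```python
import math

ETA = [[-1.0, 0.0], [0.0, 1.0]]  # assumed metric for the (t, x) pair: diag(-1, +1)

def matmul(a, b):
    """Plain-Python matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def boost(phi):
    """Boost with hyperbolic angle phi acting on (t, x)."""
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s], [s, c]]

def preserves_eta(r, tol=1e-9):
    """True if R^T eta R = eta, i.e. the quadratic form is left unchanged."""
    m = matmul(transpose(r), matmul(ETA, r))
    return all(abs(m[i][j] - ETA[i][j]) < tol
               for i in range(2) for j in range(2))

def det2(r):
    return r[0][0] * r[1][1] - r[0][1] * r[1][0]

r1, r2 = boost(0.3), boost(1.1)
product = matmul(r1, r2)  # group closure: the product should also qualify
print(preserves_eta(r1), preserves_eta(r2), preserves_eta(product))
print(abs(det2(product) - 1.0) < 1e-9)
```

The product of the two boosts passes the same test as each factor, which is the closure property invoked in the text.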

Again, the Lie algebra of SO(m, n) is obtained by studying the infinitesimal transformation $R \simeq I + i\theta^{\mu\nu}J_{\mu\nu}$,
with real parameters $\theta^{\mu\nu}$, the analogs of the angles $\varphi$ and $\theta$ in our earlier discussions. You
can verify that the entire discussion in appendix 2 to chapter I.3 and in this chapter can be repeated, with the
Kronecker delta $\delta_{\mu\nu}$ replaced by the generalized Minkowski metric $\eta_{\mu\nu}$. In particular, the commutation relations
between the generators $J_{\mu\nu}$ can be carried over from (I.3.23):


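$$[J_{\mu\nu}, J_{\rho\sigma}] = i(\eta_{\mu\rho}J_{\nu\sigma} + \eta_{\nu\sigma}J_{\mu\rho} - \eta_{\nu\rho}J_{\mu\sigma} - \eta_{\mu\sigma}J_{\nu\rho}) \tag{21}$$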


We specialize to the Lorentz group SO(1, 3) or SO(3, 1). Reverting to standard physics notation, we have 3
boosts generated by $K_i = J_{0i}$ and 3 rotations by $J_i = \frac{1}{2}\varepsilon_{ijk}J_{jk}$. We can read off from (21) the commutation relations
between boosts and rotations:


$$[J_x, J_y] = [J_{23}, J_{31}] = -i\eta_{33}J_{21} = iJ_{12} = iJ_z \tag{22}$$
$$[J_z, K_x] = [J_{12}, J_{01}] = -i\eta_{11}J_{20} = iJ_{02} = iK_y \tag{23}$$
$$[K_x, K_y] = [J_{01}, J_{02}] = +i\eta_{00}J_{12} = -iJ_z \tag{24}$$


All other commutation relations can be gotten from cyclically permuting the ones displayed here. 
The relation (22) generates the subalgebra corresponding to the rotation subgroup SO(3) of SO(3, 1), familiar
from chapter I.3. The relation (23) tells us that the 3 boost generators $(K_x, K_y, K_z)$ transform as a 3-vector under
the rotation group, exactly as you would expect. (To see this, consider $K_x(\theta) = e^{-i\theta J_z}K_x e^{i\theta J_z}$. Differentiating, we
obtain

$$\frac{dK_x(\theta)}{d\theta} = e^{-i\theta J_z}(-i)[J_z, K_x]e^{i\theta J_z} = e^{-i\theta J_z}(-i)(iK_y)e^{i\theta J_z} = K_y(\theta)$$

Similarly, $\frac{dK_y(\theta)}{d\theta} = -K_x(\theta)$. Solving, we obtain $K_x(\theta) = \cos\theta\, K_x + \sin\theta\, K_y$ and $K_y(\theta) = -\sin\theta\, K_x + \cos\theta\, K_y$.)


The third relation (24) is the most interesting: it shows that successively boosting in the x and y directions can
produce a rotation¹³ about the z-axis.
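The content of (24) can be seen with explicit 4-by-4 matrices. The sketch below is an illustration under my own assumptions: it uses real generators acting on (t, x, y, z), which differ from the Hermitian $K_x$, $K_y$, $J_z$ of the text by factors of i, and the names `BOOST_X`, `BOOST_Y`, and `ROT_XY` are hypothetical labels, not notation from the text.

```python
def unit(i, j, n=4):
    """Matrix with a single 1 in row i, column j."""
    m = [[0.0] * n for _ in range(n)]
    m[i][j] = 1.0
    return m

def combine(a, b, sign):
    """Return a + sign * b, entry by entry."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1.0)

# Real generators on (t, x, y, z); assumed forms, differing from K and J by factors of i.
BOOST_X = combine(unit(0, 1), unit(1, 0), +1.0)  # symmetric, mixes t and x
BOOST_Y = combine(unit(0, 2), unit(2, 0), +1.0)  # symmetric, mixes t and y
ROT_XY = combine(unit(1, 2), unit(2, 1), -1.0)   # antisymmetric, mixes x and y only

result = commutator(BOOST_X, BOOST_Y)
print(result == ROT_XY)  # the commutator of two boosts is a pure rotation generator
```

The commutator has no entries in the time row or column: what is left acts only in the x-y plane, that is, it generates a rotation about the z-axis.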

This set of commutation relations underlies many interesting developments in quantum field theory, such 
as the Dirac equation, the Weyl equation, parity violation in the weak interaction, helicity spinors, and twistors, 
to name a few. 

We also note that the differential representation of the rotation generators mentioned in chapter I.3 may be
immediately promoted to the differential representation of the generators of SO(m, n)

$$J_{\mu\nu} = -i(x_\mu\partial_\nu - x_\nu\partial_\mu) \tag{25}$$

Note that since $J_{\mu\nu}$ is defined with two lower indices, our index convention requires $x_\mu = \eta_{\mu\nu}x^\nu$, rather than
$x^\mu$, to appear on the right hand side. Thus, in calculating the commutator in (21) when we push a $\partial_\rho = \partial/\partial x^\rho$, for
example, past $x_\sigma = \eta_{\sigma\nu}x^\nu$, we will get $\eta_{\rho\sigma}$. This is another way of seeing why the generalized Minkowski metric
appears in (21).

Finally, we check explicitly that (17) analytically continues to a rotation. Write $t = x^0 = -ix^4$ and $\varphi = i\theta$ and
continue to $x^4$ and $\theta$ real. Then $\cosh\varphi = \frac{1}{2}(e^{\varphi} + e^{-\varphi}) \to \frac{1}{2}(e^{i\theta} + e^{-i\theta}) = \cos\theta$ and
$\sinh\varphi = \frac{1}{2}(e^{\varphi} - e^{-\varphi}) \to \frac{1}{2}(e^{i\theta} - e^{-i\theta}) = i\sin\theta$. Then the relevant part of (17) becomes (we write $x^1$ for $x$)

$$x'^4 = \cos\theta\, x^4 + \sin\theta\, x^1$$
$$x'^1 = -\sin\theta\, x^4 + \cos\theta\, x^1 \tag{26}$$

precisely a rotation in the (1-4) plane. The Lorentz group SO(3, 1) is intimately connected to the 4-dimensional
rotation group SO(4). Clearly, linear transformations that leave $(x^0)^2 - \vec{x}^2$ invariant upon analytic continuation
$x^0 \to x^4$ leave $(x^4)^2 + \vec{x}^2 = (x^1)^2 + (x^2)^2 + (x^3)^2 + (x^4)^2$ invariant. Going from $x^0$ to $x^4$, known as Wick rotation,
is a standard procedure in quantum field theory.¹⁵
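The two identities used in the continuation are easy to confirm numerically. This is a small sketch using Python's standard `cmath` module; the function names and the sample values of $\theta$, $x^4$, and $x^1$ are arbitrary choices of mine.

```python
import cmath
import math

def continued_boost_entries(theta):
    """cosh and sinh evaluated at the imaginary angle phi = i * theta."""
    phi = 1j * theta
    return cmath.cosh(phi), cmath.sinh(phi)

def rotate_14(x4, x1, theta):
    """Equation (26): rotation in the (1-4) plane."""
    return (math.cos(theta) * x4 + math.sin(theta) * x1,
            -math.sin(theta) * x4 + math.cos(theta) * x1)

theta = 0.7  # arbitrary sample angle
ch, sh = continued_boost_entries(theta)
print(abs(ch - math.cos(theta)) < 1e-12)       # cosh(i theta) = cos(theta)
print(abs(sh - 1j * math.sin(theta)) < 1e-12)  # sinh(i theta) = i sin(theta)

x4, x1 = 1.5, -0.4  # arbitrary sample point
y4, y1 = rotate_14(x4, x1, theta)
# The Euclidean quadratic form (x^4)^2 + (x^1)^2 is unchanged by (26).
print(abs((y4 ** 2 + y1 ** 2) - (x4 ** 2 + x1 ** 2)) < 1e-12)
```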


Appendix 2: From the Lorentz algebra to the Poincaré algebra 


The set of generators $J_{\mu\nu}$ can be supplemented by the generators of translation $P_\mu = i\partial_\mu$. We see that $P_\mu$ generates
translation by acting with it: $(I - ia^\mu P_\mu)x^\lambda = (I + a^\mu\partial_\mu)x^\lambda = x^\lambda + a^\lambda$. This is of course the same way we see that
$J_{\mu\nu}$ generates rotations and boosts. (Even farther back, in chapter I.3, we saw that $J_z = i\left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right)$ generates
rotation about the z-axis: act with $I + i\theta J_z$ on (x, y, z).)

Thus, the Lorentz algebra defined by (21) can be extended to the so-called Poincaré algebra, generated by
$(J_{\mu\nu}, P_\mu)$. In addition to the commutation relation (21), we now have

$$[P_\mu, P_\nu] = 0, \qquad [J_{\mu\nu}, P_\rho] = i(\eta_{\mu\rho}P_\nu - \eta_{\nu\rho}P_\mu) \tag{27}$$

Exercises 


1 Write $d\tau^2$ in light cone coordinates.


2 Just as we are allowed to change coordinates in Euclidean space and in curved spaces, of course we are
also free to change coordinates in Minkowski spacetime. Consider, for example (after going to the usual
spherical coordinates $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$), the transformation $t = \rho\sinh T$,
$r = \rho\cosh T$, introduced by Rindler, with T ranging from $-\infty$ to $\infty$ and $\rho$ ranging from 0 to $\infty$. Show that
$d\tau^2 = dt^2 - dx^2 - dy^2 - dz^2 = \rho^2 dT^2 - d\rho^2 - \rho^2\cosh^2 T\,(d\theta^2 + \sin^2\theta\, d\varphi^2)$. For fixed $\theta$ and $\varphi$, the lines of
constant $\rho$ trace out hyperbolas in the (t-r) plane as T ranges from $-\infty$ to $\infty$. Note that, since $r > |t|$, the
coordinates $(T, \rho, \theta, \varphi)$ cover only one quadrant or wedge of Minkowski spacetime (figure 6).

Figure 6 Rindler coordinates cover only one
quadrant or wedge of Minkowski spacetime.


3 As emphasized in the text, we can certainly describe accelerating particles in special relativity. Consider the
4-velocity $V^\mu = \frac{dx^\mu}{d\tau}$ and the 4-acceleration $a^\mu = \frac{dV^\mu}{d\tau} = \frac{d^2x^\mu}{d\tau^2}$. Show that $a_\mu V^\mu = 0$, so that in the rest frame
of the particle, $a^\mu = (0, \vec{a})$.


4 Show that the worldline of a particle with acceleration given by $a_\mu a^\mu = g^2$, with g a constant, is a hyperbola.


5 Fermi-Walker transport: Consider an observer undergoing arbitrary acceleration, carrying with her a vector
$W^\mu$. The vector $W^\mu$ is said to be Fermi-Walker transported if it varies along the observer's worldline
according to

$$\frac{dW^\mu}{d\tau} = (V^\mu a^\nu - a^\mu V^\nu)W_\nu \tag{28}$$

(a) Show that the velocity $V^\mu$ is Fermi-Walker transported.
(b) Show that if $U^\mu$ and $W^\mu$ are both Fermi-Walker transported, the scalar product $U_\mu W^\mu$ is left unchanged.
These results imply the physically sensible conclusion that an observer in an accelerating rocketship
can perfectly well enjoy the benefits of having an orthonormal coordinate frame. Our observer can set up,
at some proper time, an orthonormal coordinate frame consisting of her 4-velocity, namely the timelike
vector $V^\mu$ and three spacelike unit 4-vectors $e_a^\mu$ (with a = 1, 2, 3), satisfying the orthonormal conditions
$e_a \cdot e_b = \delta_{ab}$, $V \cdot e_a = 0$, and of course $V \cdot V = -1$. In her rest frame, $V^\mu = (1, 0, 0, 0)$, and she can choose
$e_1^\mu = (0, 1, 0, 0)$, $e_2^\mu = (0, 0, 1, 0)$, $e_3^\mu = (0, 0, 0, 1)$. If she Fermi-Walker transports $e_a^\mu$, then the result of
this exercise guarantees that the orthonormal coordinate frame will remain orthonormal: the orthonormal
conditions $e_a \cdot e_b = \delta_{ab}$, $V \cdot e_a = 0$, and $V \cdot V = -1$ are all preserved.


6 Work out explicitly how the components $F^{0i}$ and $F^{ij}$ of the antisymmetric tensor $F^{\mu\nu}$ introduced in chapter
1.6 transform under a Lorentz transformation.


7 Show that the signature is an invariant. (This was known in the 19th century as Sylvester’s law of inertia.
Sylvester will appear again in chapter III.5.)


8 Follow a boost in the x direction with a boost in the y direction. Take the infinitesimal limit and compare
with (24).


9 We observe an experimentalist moving by with 4-velocity $u^\mu$ and a particle zipping by with 4-momentum
$p^\mu$. Show that the magnitude of the particle’s 3-momentum as seen by the experimentalist is given by

$$|\vec{p}| = \left[(p \cdot u)^2 + p \cdot p\right]^{1/2}$$


10 Generalize the triangle shown in figure 1 by letting B = (t, x) for 0 < x < t, while keeping A = (0, 0) and
C = (2, 0) fixed, so that all three sides are timelike. By symmetry, we can take 0 < t < 1. Show that $d_{AB} + d_{BC}$
is always less than $d_{AC} = 2$. This exercise can be interpreted as the twin paradox. One twin stays at home,
while the other goes from A to B and then from B to C. By the way, although both in the text and here, the
side AC is aligned with the time axis, the situation analyzed is actually more general, since by a Lorentz
transformation, we can always bring AC to align with the time axis.


Notes 


1. It puzzles me somewhat that Lorentz and Einstein did not realize this. Poincaré apparently did. See
T. Damour, Once Upon Einstein.
2. L. Boroditsky, Cognition 75 (2000), p. 1.
3. Translation of an address delivered at Cologne, 1908. Reprinted in The Principle of Relativity: A Collection of
Papers by A. Einstein, H. Lorentz, H. Weyl and H. Minkowski, with Notes by A. Sommerfeld, Dover, 1952.
4. F. Dyson, Disturbing the Universe, Harper and Row, 1979, chapter 17.
5. See, for example, Fearful, p. 169.
6. Einstein had introduced in his 1905 paper what later became known as the clock paradox. The twins were
introduced by P. Langevin in 1911.
7. This “resolution” of the twin “paradox” has already been emphasized in several well-known textbooks. See,
for example, W. Rindler, Relativity, p. 77; J. B. Hartle, Gravity, p. 65.
8. Indeed, within days of writing this, I got into a heated argument on a social occasion with a Caltech physicist
about this very point. I was arguing in Einstein’s favor.
9. T. Damour, Once Upon Einstein, p. 49.
10. Note in this connection that Newton was 24 in 1666, his miraculous year.
11. H. Poincaré, “Space and Time,” paper presented at a conference at the University of London, May 4, 1912
(Scientia 12 (1912), p. 159 [in French]).
12. H. Minkowski, “Space and Time,” in A. Einstein et al., The Principle of Relativity.
13. This mathematical fact leads to the physical phenomena of Thomas precession and spin-orbit coupling in
atomic physics.
14. See, for example, S. Weinberg, The Quantum Theory of Fields; QFT Nut.
15. For example, QFT Nut, pp. 12 and 23.

Scarcely anyone who truly understands relativity theory can
escape this magic. 


—A. Einstein 


Events and worldlines 


Now that we have the Lorentz transformation*

$$t' = \frac{t + vx}{\sqrt{1 - v^2}}, \qquad x' = \frac{x + vt}{\sqrt{1 - v^2}} \tag{1}$$

and its inverse (obtained instantly by flipping v to −v)

$$t = \frac{t' - vx'}{\sqrt{1 - v^2}}, \qquad x = \frac{x' - vt'}{\sqrt{1 - v^2}} \tag{2}$$

we are ready to work out all kinds of problems involving special relativity. Hopefully, a
few examples will suffice to give you the idea of how to proceed in tackling this kind of
problem.

Many students, and not a few professionals, get easily confused by problems in special 
relativity. I recommend the following plodding, but almost foolproof, method. Make a list 
of all the relevant events and their given locations in spacetime. Recall from the preceding 
chapter that an event is specified by where and when it occurred, that is, by a point in 
spacetime. Even better, if necessary, work out the relevant worldlines. The locations of the 
relevant events are given sometimes in the primed frame, sometimes in the unprimed 
frame, and sometimes “in a mixture” with some events located in one frame and others 
located in the other frame. After all the locations are written down, then it is just a matter 


* For the relative velocity between the two frames, we have switched from u, used in the preceding chapter, to
v here.

Figure 1 The tick (B) and the tock (D) of a clock as seen by two different
observers. 


of elementary algebra to work out the desired information using (1) and (2). You are free to 
call either frame the primed frame, and the other unprimed, whatever seems more natural 
or would make the arithmetic “look better.” Often the solution can be found more quickly 
by using a clever observation, typically based on invariance. This will be illustrated below 
in our examples. 


Time goes by 


Perhaps the most astonishing prediction of special relativity is that time flows at different 
rates for different observers. The discussion in the preceding chapters already implies this, 
but let us now work out the effect carefully. 

Provide two observers, Ms. Unprime and Mr. Prime, with identically manufactured 
clocks, going tick tock tick tock. Consider two spacetime events: Ms. Unprime’s clock 
ticks, an event we call B, then her clock tocks, an event we call D. Denote by T the time 
between tick and tock. We now write down the spacetime location of these two events with 
pedantic care: 


B: $(t, x)_B = (0, 0)$
D: $(t, x)_D = (T, 0)$    (3)


Note that $x_B = 0 = x_D$: in the unprimed frame, the clock did not move (figure 1a).
Mr. Prime sees Ms. Unprime go by with her clock. We simply plug in (1) to find the 
locations of these two events in the primed frame (figure 1b): 


B: $(t', x')_B = (0, 0)$
D: $(t', x')_D = \left(\frac{T}{\sqrt{1 - v^2}}, \frac{vT}{\sqrt{1 - v^2}}\right)$    (4)

For example, $x'_D = \frac{x_D + vt_D}{\sqrt{1 - v^2}} = \frac{vT}{\sqrt{1 - v^2}}$, since $x_D = 0$ and $t_D = T$. Mr. Prime sees the time
interval

$$t'_D - t'_B = \frac{T}{\sqrt{1 - v^2}} > T \tag{5}$$

between a tick and a tock on Ms. Unprime’s clock. This interval is larger than the interval
T between a tick and a tock on his clock. Thus, we have shown that an observer sees a
moving clock as running slow, an effect known as time dilation. The clock at rest in the
observer’s frame has tocked before the moving clock tocks.
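The plodding method translates directly into a few lines of code. Here is a minimal sketch, assuming units with c = 1 as in the text; the function name `lorentz` and the sample values of T and v are my own illustrative choices.

```python
import math

def lorentz(t, x, v):
    """Equation (1): unprimed (t, x) to primed (t', x'), with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t + v * x), gamma * (x + v * t)

T = 1.0  # tick-to-tock time of the clock in its own rest frame (illustrative)
v = 0.6  # relative velocity of the two frames (illustrative)

tB, xB = lorentz(0.0, 0.0, v)  # event B: the tick, equation (3)
tD, xD = lorentz(T, 0.0, v)    # event D: the tock, equation (3)

interval = tD - tB             # what Mr. Prime measures, equation (5)
print(interval, xD)            # roughly 1.25 and 0.75 for these inputs
print(interval > T)
```

Changing v toward 1 makes the interval grow without bound, while v = 0 returns T itself.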


Each accuses the other of having a clock that runs slow 


Now Confusio looks agitated. He cries out, “Wait a second here! But to Ms. Unprime, 
Mr. Prime is the one zipping by with his clock. The very principle of relativity states that 
either observer could claim to be at rest. How can two observers, each accusing the other 
of having a clock that runs slow, both be right? This flies in the face not only of common 
sense, but of logic itself!” 

Ah, generations of students have run afoul of this point, a common confusion that 
has generated a seemingly endless stream of crackpot claims that Einstein must be 
wrong. “You mean to say that Ms. Unprime says that Mr. Prime’s clock runs slow and 
Mr. Prime says that Ms. Unprime’s clock runs slow, and yet both of them are absolutely 
correct?!” 

To see that there is no contradiction, first note that a quicker way of deriving time
dilation is to simply difference the first equation in (1) and set $\Delta x = 0$, thus concluding that
$\Delta t' = \Delta t/\sqrt{1 - v^2}$, in agreement with (5). But now suppose we difference the first equation
in (2) and set $\Delta x' = 0$, thus obtaining $\Delta t = \Delta t'/\sqrt{1 - v^2}$. Compare this with what we had a
moment ago. The two conclusions, $\Delta t' = \Delta t/\sqrt{1 - v^2}$ in one case, and $\Delta t = \Delta t'/\sqrt{1 - v^2}$
in the other, do not contradict each other, because one is derived with $\Delta x = 0$ and the other
with $\Delta x' = 0$. Logic still stands.

The lesson is simply that when we difference or differentiate we have to specify what 
we hold fixed. 


When does Mr. Prime’s clock tock? 


Confusio looked convinced but still puzzled. We gently advised him to go read this chapter 
(up to this point) again. He came back the next day saying, “In the first section of this 
chapter, you said to make a list of all the relevant events. Then in the second section, you 
listed two events: event B when Ms. Unprime’s clock ticks, and event D when her clock 
tocks. What about Mr. Prime’s clock? When does it tick and tock?” 

“Aha!” We all cried in unison, including Mr. Prime, Ms. Unprime, and Confusio. 

Confusio grumbled, “There ought to be two other events: event B’ when Mr. Prime’s 
clock ticks, and event D’ when his clock tocks. You didn’t talk about those.” 

Mr. Prime said, “I can always set my clock to tick when Ms. Unprime’s ticks, so that 
B=B’.” 

In other words, B and B’ are one and the same event. There are actually 3 events to 
reckon with: B, D, and D’. The key question is then “When does Mr. Prime’s clock tock?” 
So in addition to (3) and (4), we should also write

B′(= B): $(t', x')_{B'=B} = (0, 0)$
D′: $(t', x')_{D'} = (T, 0)$    (6)

Watch the primes and absence of primes like a hawk! The fact that the same T appears in 
(3) and (6) is part of the clock manufacturer’s warranty. 
Now, all we have to do is to plug (6) into (2) to find the spacetime locations of events 
B’(= B) and D’ in the unprimed frame: 


B′(= B): $(t, x)_{B'=B} = (0, 0)$
D′: $(t, x)_{D'} = \left(\frac{T}{\sqrt{1 - v^2}}, \frac{-vT}{\sqrt{1 - v^2}}\right)$    (7)

Thus, Ms. Unprime sees the time interval

$$t_{D'} - t_{B'} = \frac{T}{\sqrt{1 - v^2}} > T \tag{8}$$

between a tick and a tock on Mr. Prime’s clock.

Compare and contrast (5) and (8). 

Confusio exclaims, “I see! The confusion that befuddled, and befuddles, generations of
students is really a case of bad notation alert! The notations $\Delta t$, $\Delta t'$, and so forth correspond
to time differences for different pairs of events.”
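Confusio's point, that the two statements concern different pairs of events, can be made concrete. The following sketch assumes c = 1 and uses made-up values of T and v; `lorentz` and `inverse_lorentz` are my names for (1) and (2).

```python
import math

def lorentz(t, x, v):
    """Equation (1): unprimed (t, x) to primed (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t + v * x), gamma * (x + v * t)

def inverse_lorentz(tp, xp, v):
    """Equation (2): primed to unprimed, obtained by flipping v to -v."""
    return lorentz(tp, xp, -v)

T, v = 1.0, 0.8  # illustrative values

# Pair (B, D): Ms. Unprime's clock, at rest in the unprimed frame, seen by Mr. Prime.
dt_prime_BD = lorentz(T, 0.0, v)[0] - lorentz(0.0, 0.0, v)[0]  # equation (5)

# Pair (B', D'): Mr. Prime's clock, at rest in the primed frame, seen by Ms. Unprime.
D_prime_unprimed = inverse_lorentz(T, 0.0, v)                   # equation (7)
dt_unprime_BDp = D_prime_unprimed[0] - inverse_lorentz(0.0, 0.0, v)[0]  # equation (8)

print(dt_prime_BD, dt_unprime_BDp)   # both exceed T
print(D_prime_unprimed != (T, 0.0))  # D' and D are different events
```

Both observers find a dilated interval, and there is no clash because D and D′ sit at different points of spacetime.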


Birth and death of particles 


In everyday life, $v \ll 1$, so the effect of time dilation is minimal, but in high energy physics,
particles typically move almost at light speed, and time dilation is of central importance 
to experimentalists. For instance, a cosmic ray particle (a proton, for example) may crash 
into a nucleus in a photographic emulsion, thus producing an unstable particle moving at 
speed v and disintegrating at a distance L downstream. What is the particle’s lifetime T 
in its rest frame, that is, its intrinsic lifetime? 

The two relevant events are B, the birth of the particle, and D, its death. Let the lab frame 
the experimentalist is sitting in be primed, and the rest frame of the particle be unprimed. 
The preceding analysis immediately applies. (Now you see why I used the letters B and D 
earlier in (3) and (4).) 

In its own rest frame, the particle did not go anywhere: it died where it was born.
That’s what “at rest” means! Therefore, $\Delta x = 0$, and so from (1), we obtain the time
elapsed between birth and death of a fast moving particle as seen by the experimentalist
$T_{\rm lab} = t'_D - t'_B = \frac{T}{\sqrt{1 - v^2}}$. As $v \to 1$, $T_{\rm lab}$ can become much larger than the particle’s intrinsic
lifetime T. The distance traversed (in the lab frame, of course) is given by $L = \Delta x' = \frac{vT}{\sqrt{1 - v^2}}$.
Knowing L and v, the experimentalist can solve to obtain the intrinsic lifetime
$T = \sqrt{1 - v^2}\,L/v$.


Time dilation facilitated the measurement of particle lifetimes in the early days of 
particle physics: fast moving particles can live for quite a while in the lab frame, so that the 
track of a decaying high energy particle can be recorded in some medium. Experimentalists
could literally measure L with a ruler. That there is indeed time dilation has been verified 
experimentally countless numbers of times. 

A faster way of solving this problem is to use the invariance of the proper time interval
as discussed in the preceding chapter: $(t_D - t_B)^2 - (x_D - x_B)^2 = (t'_D - t'_B)^2 - (x'_D - x'_B)^2$,
namely $[(T, 0) - (0, 0)]^2 = T^2 = [(T_{\rm lab}, vT_{\rm lab}) - (0, 0)]^2 = T_{\rm lab}^2(1 - v^2)$, and so we obtain,
once again, $T_{\rm lab} = T/\sqrt{1 - v^2}$. As indicated in figure 1, the line joining B and D looks longer
in (1b) than in (1a), but because of the minus sign in the Minkowski metric, the two lines
actually have the same “length.”
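As a quick numerical illustration of the experimentalist's procedure, here is a sketch with c = 1, so that lengths and times share units. The track length and speed below are made-up numbers, not data from any experiment, and `intrinsic_lifetime` is a name I introduce for the formula in the text.

```python
import math

def intrinsic_lifetime(L, v):
    """T = sqrt(1 - v^2) L / v, the rest-frame lifetime from lab measurements."""
    return math.sqrt(1.0 - v * v) * L / v

L = 3.0    # track length measured with a ruler (made-up value)
v = 0.995  # measured speed in units of c (made-up value)

T = intrinsic_lifetime(L, v)
T_lab = L / v  # time of flight in the lab frame
print(T, T_lab)

# Invariance of the interval: T_lab^2 - L^2 should equal T^2.
print(abs((T_lab ** 2 - L ** 2) - T ** 2) < 1e-9)
```

For these inputs the lab-frame lifetime is about ten times the intrinsic one, which is why such tracks are long enough to measure.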


Lorentz-Fitzgerald length contraction 


After the discussion of time dilation, it is easy to understand that with two rulers moving 
relative to each other, each could see the other as having become shorter. Again, let’s go 
slowly. Consider an observer watching a ruler going by. Let us call the rest frame of the 
ruler the unprimed frame. 

The back end of the ruler traces out the worldline $(t, x)_B = (t, 0)$, in other words, a line
parametrized by $t_B(t) = t$, $x_B(t) = 0$. The statement that the length of the ruler is L means
that the front end traces out $(t, x)_F = (t, L)$. (See figure 2a.) Here, t is to be thought of
as a parameter that runs from $-\infty$ to $\infty$. In the primed frame, according to (1), these
two worldlines are described by $(t', x')_B = \left(\frac{t}{\sqrt{1 - v^2}}, \frac{vt}{\sqrt{1 - v^2}}\right)$ and $(t', x')_F = \left(\frac{t + vL}{\sqrt{1 - v^2}}, \frac{L + vt}{\sqrt{1 - v^2}}\right)$.
Again, t is to be thought of as a parameter that runs from $-\infty$ to $\infty$. To be utterly pedantic,
let me state that we are here dealing with four functions of t: $t'_B(t) = \frac{t}{\sqrt{1 - v^2}}$, $x'_B(t) = \frac{vt}{\sqrt{1 - v^2}}$,
$t'_F(t) = \frac{t + vL}{\sqrt{1 - v^2}}$, and $x'_F(t) = \frac{L + vt}{\sqrt{1 - v^2}}$. Let’s plot (figure 2b) these two lines in the primed
coordinates, just two parallel straight lines both with slope $\frac{\Delta x'}{\Delta t'} = v$.


Figure 2 The back end (B) and the front end (F) of a ruler as seen by two different observers:
(a) an observer at rest with the ruler and (b) an observer watching the ruler move by.
The key point is that as far as Mr. Prime is concerned, the length of the ruler is given
by $x'_F - x'_B$ evaluated at the same time, that is, for $t'_F = t'_B$. Graphically, we see that we can
choose any value of $t'_F$ and $t'_B$ as long as they are equal. So choose $t'_F = \frac{t + vL}{\sqrt{1 - v^2}} = 0$, which
gives $t = -vL$ and thus $x'_F = \frac{L(1 - v^2)}{\sqrt{1 - v^2}} = \sqrt{1 - v^2}\,L < L$ (while $x'_B = 0$ at $t'_B = 0$). According to Mr. Prime, the ruler
has length less than L: it has contracted!

Again, you can see that, pace the crackpots, there is no logical contradiction with the
two observers each claiming the other’s ruler has contracted. According to Ms. Unprime,
the length of the ruler is defined by $x_F - x_B$ evaluated for $t_F = t_B$. For Mr. Prime, the length
of the ruler is given by $x'_F - x'_B$ evaluated for $t'_F = t'_B$, most certainly not for $t_F = t_B$!

Just as in the time dilation discussion, the result for length contraction can be obtained
almost instantly by differencing (1) or (2), evaluating $\Delta x$ with $\Delta t = 0$, or $\Delta x'$ with $\Delta t' = 0$.

Historically, Fitzgerald had the clever idea that if moving rulers are contracted, then we 
could understand the puzzling result of the Michelson-Morley experiment. 
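The same-time requirement is the whole story, and it is easy to mimic in code. This is a sketch under the text's conventions (c = 1, ruler at rest in the unprimed frame); the function names and sample numbers are mine.

```python
import math

def lorentz(t, x, v):
    """Equation (1): unprimed (t, x) to primed (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t + v * x), gamma * (x + v * t)

def contracted_length(L, v):
    """Length of a ruler of rest length L, measured at equal primed times."""
    t_front = -v * L                 # parameter value at which t'_F = 0
    tF, xF = lorentz(t_front, L, v)  # front end of the ruler at t' = 0
    tB, xB = lorentz(0.0, 0.0, v)    # back end of the ruler at t' = 0
    assert abs(tF - tB) < 1e-12      # both ends are sampled at the same t'
    return xF - xB

L, v = 2.0, 0.6  # illustrative values
print(contracted_length(L, v), math.sqrt(1.0 - v * v) * L)
```

Note that the two ends are sampled at different values of the unprimed time (t = −vL for the front, t = 0 for the back), which is exactly why the two observers do not contradict each other.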


Dueling theorists and the fall of simultaneity 


For our next example, let us go back to the dueling theorists described in the prologue, on 
pages 7-9. We have three events, V = Professor Vicious at the rear of the carriage pushing 
the button to signal that she has finished her calculation, N = Dr. Nasty at the front end of 
the carriage pushing the button, and G = the gong bonging indicating that it has detected 
the arrival of two pulses of photons at the same instant. In the unprimed frame on the 
train, we write down 

V: $(t, x)_V = (0, -L)$

N: $(t, x)_N = (0, +L)$

G: $(t, x)_G = (L, 0)$    (9)


We have set the length of the carriage to be 2L, with the detector located in the middle.
Since photons went from N to G, the invariant interval between G and N must vanish
(and similarly for the invariant interval between G and V). This requirement fixes $t_G = L$
(indeed, the invariant interval between G and N is equal to $(L - 0)^2 - (0 - L)^2 = 0$).
Now that we have listed the three events, it is, as I said, more or less foolproof to find the
primed coordinates for these events by simply plugging the unprimed coordinates given
in (9) into the Lorentz transformation in (1), $t' = \frac{t + vx}{\sqrt{1 - v^2}}$, $x' = \frac{x + vt}{\sqrt{1 - v^2}}$:

V: $(t', x')_V = \left(\frac{-vL}{\sqrt{1 - v^2}}, \frac{-L}{\sqrt{1 - v^2}}\right)$
N: $(t', x')_N = \left(\frac{vL}{\sqrt{1 - v^2}}, \frac{L}{\sqrt{1 - v^2}}\right)$
G: $(t', x')_G = \left(\frac{L}{\sqrt{1 - v^2}}, \frac{vL}{\sqrt{1 - v^2}}\right)$    (10)


We see immediately, with no further ifs and buts, that in this frame, event N occurs after
V: $t'_N - t'_V = 2vL/\sqrt{1 - v^2} > 0$. The Swede standing on the platform has no doubt that Dr.
Nasty does not get to go to Stockholm.
Figure 3 The pole in the barn.


As a check on our calculation, we can compute in the primed frame the velocity of
light getting from the point V to the point G in spacetime: $\Delta t' = t'_G - t'_V = \frac{(1 + v)L}{\sqrt{1 - v^2}}$ and
$\Delta x' = x'_G - x'_V = \frac{(1 + v)L}{\sqrt{1 - v^2}}$, giving $\frac{\Delta x'}{\Delta t'} = 1$. The reader can easily check that in getting from
point N to point G, the velocity of light equals $\frac{\Delta x'}{\Delta t'} = -1$. These results of course just reflect
the invariance of the interval $(\Delta t')^2 - (\Delta x')^2 = 0 = (\Delta t)^2 - (\Delta x)^2$.
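The list-the-events recipe for the dueling theorists can be written out as a short script. It is a sketch only, assuming c = 1 and arbitrary sample values for L and v; the dictionary layout and the helper `velocity` are my own devices.

```python
import math

def lorentz(t, x, v):
    """Equation (1): unprimed (t, x) to primed (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t + v * x), gamma * (x + v * t)

L, v = 1.0, 0.5  # half-length of the carriage and train speed (illustrative)

# Equation (9): the three events in the train (unprimed) frame.
events = {"V": (0.0, -L), "N": (0.0, +L), "G": (L, 0.0)}

# Equation (10): the same events in the platform (primed) frame.
primed = {name: lorentz(t, x, v) for name, (t, x) in events.items()}

def velocity(a, b):
    """Slope dx'/dt' of the straight line from event a to event b."""
    (ta, xa), (tb, xb) = primed[a], primed[b]
    return (xb - xa) / (tb - ta)

print(primed["N"][0] - primed["V"][0])         # positive: N occurs after V
print(velocity("V", "G"), velocity("N", "G"))  # close to +1 and -1
```

Simultaneity is lost (the first number is not zero), but the speed of light is the same in both frames.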


Pole in the barn 


Another apparent paradox that has confounded generations of students is the traditional 
pole in the barn problem, often given on exams. 

One fine day, Ms. Unprime is possessed by a maniacal desire to run at almost light speed 
carrying a pole of length L toward a barn of length L. (See figure 3.) The farmer who owns 
the barn, Mr. Prime, calmly watches. To him, the pole has contracted and thus should fit 
easily inside the barn. But to Ms. Unprime, the barn, rushing toward her, has contracted 
alarmingly. 

To dramatize the story further, suppose that Mr. Prime, also afflicted by some mental 
disorder, closes the front door of his barn while leaving the back door open. He has rigged 
things up with fancy electronics so that the front door won’t fling open until the instant 
the back end of the pole gets inside the barn—that is, as soon as it passes the back door. 
For good measure, although not actually necessary for the narrative, at that instant, the 
back door will slam shut. Will there be a crash, or will Ms. Unprime sail right through? 

Most importantly, stay calm. (That is, you the exam taker.) Write down the 4 relevant 
worldlines, that of the back end of the pole (Pb), the front end of the pole (Pf), the back 
door of the barn (Bb), and the front door of the barn (Bf): 


$(t, x)_{Pb} = (t, 0)$, $(t, x)_{Pf} = (t, L)$    (11)
and
$(t', x')_{Bb} = (t', 0)$, $(t', x')_{Bf} = (t', L)$    (12)


Notice that we have written down these two pairs of parallel lines in their respective rest 
frames (figure 4a,b). Plugging (11) into (1), we obtain 
Figure 4 The pole in the barn as seen in spacetime: (a) the worldlines Pb and Pf in the unprimed frame; (b) the worldlines Bb and Bf in the primed frame; (c) all four worldlines in the unprimed (t, x) frame; (d) all four worldlines in the primed (t′, x′) frame. Panels (c) and (d) mark the events “front door of barn opens” and “front end of pole exits barn.”

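$(t', x')_{Pb} = \left(\frac{t}{\sqrt{1 - v^2}}, \frac{vt}{\sqrt{1 - v^2}}\right)$, $(t', x')_{Pf} = \left(\frac{t + vL}{\sqrt{1 - v^2}}, \frac{L + vt}{\sqrt{1 - v^2}}\right)$    (13)

Plugging (12) into (2), we obtain

$(t, x)_{Bb} = \left(\frac{t'}{\sqrt{1 - v^2}}, \frac{-vt'}{\sqrt{1 - v^2}}\right)$, $(t, x)_{Bf} = \left(\frac{t' - vL}{\sqrt{1 - v^2}}, \frac{L - vt'}{\sqrt{1 - v^2}}\right)$    (14)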


(Is this plodding enough for you? Yes, plodding but foolproof.) 

Note that in (14), as $t'$ varies, $(t, x)_{Bb}$ and $(t, x)_{Bf}$ trace out two parallel lines with
slope $\Delta x/\Delta t = -v$ as indicated in figure 4c. (This provides a slight check against copying
errors.) To construct the figure, we have to figure out where the line Bf intersects the
x-axis. Simply set $t_{Bf}$ in (14) to 0, thus giving $t' = vL$. Plugging that into (14), we obtain
$x_{Bf}(\text{at } t_{Bf} = 0) = \frac{L - v^2L}{\sqrt{1 - v^2}} = \sqrt{1 - v^2}\,L < L$. We are of course just rediscovering the
Lorentz-Fitzgerald length contraction.

Similarly, in (13), as $t$ varies, $(t', x')_{Pb}$ and $(t', x')_{Pf}$ trace out two parallel lines with slope
$\Delta x'/\Delta t' = v$ as indicated in figure 4d. You can verify that the line Pf intersects the x′-axis
at $x'_{Pf}(\text{at } t'_{Pf} = 0) = \sqrt{1 - v^2}\,L < L$.

Now we go from worldlines to events. An important event in the story is when the back 
end of the pole (Pb) reaches the back door of the barn (Bb). Recall that at that instant, 
Mr. Prime flings open the front door of the barn. 

You should be able to see right off when that occurs, but let us be plodding. Set
$(t, x)_{Pb} = (t, 0)$ from (11) equal to $(t, x)_{Bb} = \left(\frac{t'}{\sqrt{1 - v^2}}, \frac{-vt'}{\sqrt{1 - v^2}}\right)$ from (14). It is important to note
that we set coordinates equal in the same frame of course, namely $(t, x)_{Pb} = (t, x)_{Bb}$. The
resulting two equations, namely $t = \frac{t'}{\sqrt{1 - v^2}}$ and $0 = \frac{-vt'}{\sqrt{1 - v^2}}$, give $t = t' = 0$. Thus, in
fact, this event occurs at the origin in both the primed and unprimed frames, which reflects
our wisdom in setting our coordinate systems. (Did you figure that out without solving the
equations like a plodder? You can see it from figure* 4c,d.)


At that instant, the front door of the barn opens. For Ms. Unprime, this occurs at
$(t, x)_{Bf}(\text{when front door of barn opens}) = \left(\frac{-vL}{\sqrt{1 - v^2}}, \frac{L}{\sqrt{1 - v^2}}\right)$. Note that $t_{Bf}(\text{when front
door of barn opens}) < 0$. More importantly, $x_{Bf}(\text{when front door of barn opens}) = \frac{L}{\sqrt{1 - v^2}} > L$,
and she sails right through! (See figure 4c.)


It is interesting to check the spacetime location of another important event, the beginning
of the pole’s exit from the barn, namely when the front end of the pole (Pf)
passes through the front door of the barn (Bf). In the unprimed coordinates, this occurs
when $(t, x)_{Pf} = (t, L)$ is equal to $(t, x)_{Bf} = \left(\frac{t' - vL}{\sqrt{1 - v^2}}, \frac{L - vt'}{\sqrt{1 - v^2}}\right)$. Solving the two equations
$t' - vL = \sqrt{1 - v^2}\,t$ and $L - vt' = \sqrt{1 - v^2}\,L$, we obtain $t = (\sqrt{1 - v^2} - 1)L/v < 0$ and
$x = L$. Similarly, in the primed frame, the exit occurs at $t' = (1 - \sqrt{1 - v^2})L/v > 0$ and
$x' = L$. See figure 4c,d.
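For the exam taker who wants to double-check the algebra, the two key events can be pushed through (2) numerically. This is a sketch with c = 1 and made-up values of L and v; the variable names are mine, and the primed coordinates of each event are taken from the discussion above.

```python
import math

def lorentz(t, x, v):
    """Equation (1): unprimed (t, x) to primed (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t + v * x), gamma * (x + v * t)

def inverse_lorentz(tp, xp, v):
    """Equation (2): primed to unprimed, obtained by flipping v to -v."""
    return lorentz(tp, xp, -v)

L, v = 1.0, 0.8  # illustrative values
root = math.sqrt(1.0 - v * v)

# Front door (Bf) opens at primed coordinates (t', x') = (0, L).
t_open, x_open = inverse_lorentz(0.0, L, v)

# Front end of the pole (Pf) meets the front door (Bf) at t' = (1 - root) L / v, x' = L.
t_exit_primed = (1.0 - root) * L / v
t_exit, x_exit = inverse_lorentz(t_exit_primed, L, v)

print(t_open, x_open)   # t < 0 and x > L in Ms. Unprime's frame
print(t_exit, x_exit)   # the exit happens at x = L in her frame
print(t_open < t_exit)  # in her frame too, the door opens before the front end arrives
```

The ordering of the door opening and the exit is the same in both frames, so nobody sees a crash.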


Uncommon sense in, uncommon sense out 


As I was finishing this book, I happened to have dinner with a distinguished condensed 
matter physicist. He mentioned that he was teaching a course on special relativity and 
that he didn’t like the textbook, because it gave the students the impression that special 
relativity consisted of a series of paradoxes. I couldn’t agree with him more. I told him that 
I was writing a textbook on special relativity and I had limited the number of paradoxes. In 
my opinion, the pedagogically correct way of presenting special relativity is to emphasize, 
as I tried to do in the preceding chapter, the geometry of spacetime, a point of view that 
generalizes naturally to the curved spacetime of general relativity. 


* By the way, for the sake of clarity, I did not superpose the images in figure 4a-d. This is why this 
figure has four parts, a, b, c, and d. Also, notice that figure 2a,b (describing length contraction) is contained 
in figure 4c,d. 
The paradoxes* are of course helpful in making students understand the subtle issues
behind the Lorentz transformation, and they played an important clarifying role
historically. But once we accept that the speed of light is a universal constant, a state- 
ment that blatantly contradicts the velocity addition law of Newtonian physics, we can 
expect to encounter situations that defy the common sense built on our everyday expe- 
riences. As already mentioned in the prologue, we could say, paraphrasing computer 
scientists, uncommon sense in, uncommon sense out. All these apparent paradoxes con- 
tradict our Newtonian intuition, but they could not possibly contradict logic, as I have 
emphasized here. 

These puzzles are best regarded as refreshing reminders of how counterintuitive special 
relativity actually is. In fact, they play almost no role in actual research in high energy 
physics. Lorentz invariance is actually built into the grammar of the language used in 
high energy theory, namely quantum field theory, from the start. 


Causality and temporal ordering 


Although simultaneity fails, causality still holds, as I emphasized in the preceding chapter. 
Particles have to propagate inside their future light cones. In particular, temporal ordering 
cannot depend on the observer. 

More explicitly, consider a particle, either a material particle or a particle of light, 
propagating from event A to event B, that is, event B is in the causal future of event A. 
In other words, $\Delta t = t_B - t_A > 0$ and $|\Delta x| = |x_B - x_A| \le \Delta t$. Is it possible to reverse the 
temporal ordering by going to another frame? 

We simply difference (1): $\Delta t' = (\Delta t + v\Delta x)/\sqrt{1 - v^2}$. Since $v^2 < 1$, $\Delta t'$ cannot go negative. 
Temporal ordering is maintained. 

But perhaps this discussion suggests to you how temporal ordering might be reversed 
under certain circumstances. What are these circumstances? See if you can figure it out! 

Well, just by eyeball, we can see from $\Delta t' = (\Delta t + v\Delta x)/\sqrt{1 - v^2}$ that we can make $\Delta t'$ 
vanish by choosing $v = -\Delta t/\Delta x$. But $v^2 = (\Delta t/\Delta x)^2$ has to be less than 1, which is only 
possible if $(\Delta t)^2 < (\Delta x)^2 \le (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2$, that is, if the two events A and B are 
spacelike with respect to each other. By continuity, if we can choose a v that makes $\Delta t'$ 
vanish, we can have a v that makes $\Delta t'$ go negative. 

So, it is possible to reverse the temporal ordering of two events if they are spacelike with 
respect to each other, which means precisely that they could not affect each other causally. 
There is still sanity within the craziness, so to speak. 
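The argument above is easy to check numerically. What follows is only a minimal sketch in units with c = 1; the function names and the sample intervals are illustrative choices, not taken from the text. It evaluates $\Delta t' = (\Delta t + v\Delta x)/\sqrt{1 - v^2}$ over a range of boosts for one timelike and one spacelike separation.

```python
import math

def boosted_dt(dt, dx, v):
    """Time separation seen in a frame moving with velocity v (units with c = 1).

    Implements dt' = (dt + v*dx) / sqrt(1 - v**2), the differenced Lorentz
    transformation quoted in the text. Requires |v| < 1.
    """
    if abs(v) >= 1:
        raise ValueError("boost velocity must satisfy |v| < 1")
    return (dt + v * dx) / math.sqrt(1.0 - v * v)

def ordering_can_flip(dt, dx, velocities):
    """Return True if any of the sampled boosts makes dt' negative."""
    return any(boosted_dt(dt, dx, v) < 0 for v in velocities)

# Sample boost velocities strictly inside (-1, 1).
velocities = [k / 100.0 for k in range(-99, 100)]

# Timelike separation (|dx| < dt): no sampled boost reverses the ordering.
timelike_flips = ordering_can_flip(dt=1.0, dx=0.5, velocities=velocities)

# Spacelike separation (|dx| > dt): boosts with v < -dt/dx reverse it.
spacelike_flips = ordering_can_flip(dt=0.5, dx=1.0, velocities=velocities)

print("timelike pair can be reordered: ", timelike_flips)
print("spacelike pair can be reordered:", spacelike_flips)
```

For the spacelike pair chosen here, the critical velocity is v = -dt/dx = -0.5, at which the two events become simultaneous.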

Next, what does a reversed time ordering imply for physics? The answer is in the 
appendix. 


* In high school math, you shouldn't spend all your time doing trick problems in a puzzle book. It’s more 
important to grasp the general principles. 
Special relativity and starting your car 


No doubt, time dilation and the creation of antiparticles (see the appendix) at high energy 
accelerators are plenty stunning, but still, it would appear that special relativity is totally 
remote from everyday life. Not so! In 2011, a mere 106 years after Einstein’s paper, it 
was discovered¹ that relativistic effects account for about 1.8 volts out of the 2.1 volts 
produced by the common lead-acid battery. This is because the lead nucleus is so massive 
that the motion of the electrons around it is highly relativistic. So, the next time you hear 
a car start up somewhere, you can mutter to yourself, “Ah, the Lorentz transformation 
again!” 


“From the very first line I am stopped by ‘signs’ ” 


On March 31, 1922, the new global celebrity Albert Einstein made a triumphant visit to 
Paris, greeted by headlines like “Time Does Not Exist, Says Einstein.” A huge crowd tried 
to get in to hear his lecture. Among those caught up in the excitement was Marcel Proust, 
the famous author of the masterwork A la recherche du temps perdu [In Search of Lost Time], 
with its wistful message that the passage of time was merely an illusion. Indeed, Proust 
even thought of time as a dimension analogous to space. He wrote² to a physicist friend: 


How I would love to speak to you about Einstein! Although it has indeed been written to me 
that I derive from him, or he from me, I do not understand a single word of his theories, 
not knowing algebra. And I doubt for my part that he has read my novels. It seems we have 
analogous ways of deforming Time. But I cannot figure it out for myself, because it is me, and 
we don’t know each other, nor can I do so for him because he is a great mind in sciences that 
I am ignorant of, and from the very first line I am stopped by “signs” that I don’t recognize. 


Appendix: Appearance of antimatter 


It is possible to skip this appendix upon a first reading, as subsequent material will not depend on it. The 
discussion in this chapter, indeed in most of this textbook, is entirely classical. But as was mentioned in the 
introductory chapters, our real world is fundamentally quantum. The only knowledge of the quantum world I 
require of you for the following discussion is Heisenberg’s uncertainty principle. According to Heisenberg, due 
to fundamental limitations on what we can know about the microscopic world, we cannot measure all observables 
to arbitrary accuracy. As a result, it is possible to have spacelike propagation $|\Delta x| > \Delta t$, but only for a very short 
time limited³ by the mass of the particle. In figure 5a, we show an actual physical process in which (left half of 
the diagram) a proton turns into a neutron by emitting a pion (known as the $\pi^+$), which by charge conservation 
necessarily carries positive charge. This is event A. In event B, the $\pi^+$ is absorbed (right half of the diagram) by 
a neighboring neutron, which as a result turns into a proton. This process generates an attraction between the 
proton and the neutron, thus accounting for the strong or nuclear force. 

Spacelike propagation is allowed, as Heisenberg said, only for a short time and over a small distance given by 
the inverse of the pion mass (in natural units). Indeed, since the range of the nuclear force was known, Hideki 
Yukawa was able to predict, in 1935, the mass of the hitherto unknown pion. We cannot go into further details 
here. Instead, let us ask what an observer zipping by would see. 
Figure 5 The need for antimatter. (a) A proton turns into a neutron by emitting a 
positively charged pion, which is subsequently absorbed by a neighboring neutron, 
which as a result turns into a proton. (b) The same process as seen by a different 
observer: a neutron turns into a proton by emitting a negatively charged pion, which 
is subsequently absorbed by a neighboring proton, which as a result turns into a 
neutron. (The time axis is along the vertical direction.) This figure, which describes 
a physical process in spacetime, is known as a Feynman diagram. 


Since the pion propagates over a spacelike interval, it is possible for this observer to see a temporal ordering 
in which event A occurs after event B, as was explained in the text. Thus, she would see (figure 5b) the neutron 
turning into a proton (right half of the diagram) by emitting something (event B). But by charge conservation, 
this something necessarily has to carry a negative charge! 

In other words, given the $\pi^+$, theoretical physics has predicted a negatively charged pion with exactly the same 
mass, known as the $\pi^-$. To finish the story, the $\pi^-$ is then absorbed (event A) by the neighboring proton, which 
as a result turns into a neutron. 

The $\pi^-$ is the antiparticle of the $\pi^+$. In essence, this is the kind of physical reasoning that led Dirac to predict 
the existence of antimatter in 1928. I hope to give you, by this brief heuristic argument, some flavor of what 
might happen when you marry quantum mechanics to special relativity. In our story, the key is electric charge 
conservation. In the microscopic world, several other charges are also conserved, and thus a particle and its 
antiparticle necessarily carry opposite charges. 

Incidentally, you can see that the discussion here is a generalization of the Vicious versus Nasty story in the 
prologue. 


Notes 


1. R. Ahuja, A. Blomqvist, P. Larsson, P. Pyykkö, and P. Zaleski-Ejgierd, “Relativity and the Lead-Acid Battery,” 
Phys. Rev. Lett. 106 (2011) 018301. 

2. T. Damour, Once Upon Einstein, p. 34. 

3. The uncertainty principle states that $\Delta E\,\Delta t \sim \hbar$. Here $\Delta E > m$, which implies $\Delta t < \hbar/m$. For details, consult 
a textbook on quantum field theory, such as QFT Nut. 


The Worldline Action and the Unification of 
Material Particles with Light 


A child’s way of calculating square roots 


Imagine yourself a bright young theoretical physicist toward the end of the 19th century. 
You felt annoyed about how light and material particles were treated differently. You 
admired Fermat’s least time principle for light beams, so elegantly stated. 

Such simplicity, light in a hurry! In contrast, look at the Euler-Lagrange action for 
material particles 


$$S = \int dt \left\{ \frac{1}{2} m \left(\frac{d\vec x}{dt}\right)^2 - V(\vec x) \right\} \qquad (1)$$


Clunky in comparison. 

Then you put this thought aside and went on with your day-to-day research—you did 
need to get tenure. In spite of what people like deans say, day-to-day research was what led 
to all the good stuff, not the very best stuff, but the good stuff, in academic life. One day, 
while sitting in your office daydreaming, you remembered the first time you learned about 
the concept of a square root. You learned that the square root of 25 is 5, of 36 is 6, and so on. 
But soon, since you were one of those smart kids who grew up to be theoretical physicists, 
you wondered about the square root of a number that was not obviously the square of an 
integer. What is the square root of 24, for example? You used the time-honored method of 
trial and error. So you multiplied 4.9 by itself, 4.8 by itself, and so forth. Pretty soon you 
could guesstimate square roots quite well. Some time later, you learned about the brilliant 
idea* of representing numbers by letters. The formula you wanted turned out to be 


$$\sqrt{a^2 - \varepsilon^2} \simeq a - \frac{\varepsilon^2}{2a} \qquad (2)$$


* In 820, while working in the House of Wisdom in Baghdad, the Persian Al-Khwarizmi, whose name gave us 
the words “algorithm”¹ and “logarithm,” proposed a method of calculation he called al-jabr. 


which you verified by squaring the right hand side: $\left(a - \frac{\varepsilon^2}{2a}\right)^2 = a^2 - \varepsilon^2 + \frac{\varepsilon^4}{4a^2}$, so the error in 
(2) is of higher order. 
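The child's example of the square root of 24 makes a quick numerical illustration of (2). This is only a sketch; the function name `sqrt_near_square` and the argument `eps_sq` (the small quantity subtracted from $a^2$) are labels introduced here, not notation from the text.

```python
import math

def sqrt_near_square(a, eps_sq):
    """Approximate sqrt(a**2 - eps_sq) by a - eps_sq / (2*a), as in (2)."""
    return a - eps_sq / (2.0 * a)

# The example from the text: 24 = 5**2 - 1.
approx = sqrt_near_square(5.0, 1.0)
exact = math.sqrt(24.0)
error = approx - exact

# Squaring the approximation overshoots 24 by eps_sq**2 / (4 * a**2),
# which is the higher-order error mentioned in the text.
overshoot = approx ** 2 - 24.0

print("approximation:", approx)
print("exact value:  ", exact)
print("error:        ", error)
print("overshoot of the square:", overshoot)
```

The trial-and-error guess 4.9 is exactly what the formula returns, which is perhaps why the guesstimate converges so quickly.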

A few days after your daydream, you look at the action for material particles again. Even 
without the external potential V (x), the action still does not look like it could be simplified 
to look like anything as elegant as Fermat’s least time principle. You do what A. Zee in 
chapter III.1 of his gravity book said the calculus textbook he had in high school told him 
never to do: think of derivatives as fractions. You do precisely that and cancel off one power 
of dt: 


$$S = \int dt\, \frac{1}{2} m \left(\frac{d\vec x}{dt}\right)^2 = \int \frac{1}{2} m \frac{(d\vec x)^2}{dt} \qquad (3)$$


Now you are offended by the totally different ways in which space and time are treated. 
The undemocratic treatment of $d\vec x$ and dt irritates your liberal ideal deeply. Yes, the dean 
keeps reminding the physics faculty that every subject on campus is equally worthy of 
respect. That $d\vec x$ appears squared in (3) more or less follows from rotational invariance, 
but why does dt deserve only one power? What a strange combination $\frac{(\Delta\vec x)^2}{\Delta t}$, the square 
of something divided by something else! 

Then your subconscious nudges you: you have seen this combination before, the square 
of something divided by something else! Oh dear reader, where but where have you seen 
this before? 


Speak, memory 


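Aha! You rewrite (2) as $\frac{\varepsilon^2}{2a} \simeq -\sqrt{a^2 - \varepsilon^2} + a + \cdots$. In other words, 

$$\frac{(\Delta\vec x)^2}{2\Delta t} = c\,\frac{(\Delta\vec x)^2}{2c\Delta t} \simeq -c\sqrt{(c\Delta t)^2 - (\Delta\vec x)^2} + c^2\Delta t + \cdots \qquad (4)$$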


To get the dimensions to come out right, you are forced to introduce a constant c with the 
dimension of a speed. What could it be? 

You realize that the only speed around with any intrinsic significance is the speed of 
light. Notice that this c literally muscles its way in for dimensional reasons. We didn’t go 
looking for the speed of light, the speed of light came looking for us. So you write the 
action for a point particle as 


$$S = -mc\int\sqrt{(c\,dt)^2 - (d\vec x)^2} + mc^2\int dt$$


The relativistic point particle action 


The term $mc^2\int dt = mc^2(t_{\rm final} - t_{\rm initial})$ in the action S treats dt and $d\vec x$ differently, and 
thus would negate your entire philosophy. But fortunately, its variation vanishes, since the 
initial and final times are fixed. Hence, this term in the action does not contribute to the 
equation of motion, and so the action principle allows you to drop this offending term. 
All looks great then, and so you quickly publish the action $S = -mc\int\sqrt{(c\,dt)^2 - (d\vec x)^2}$, 
which in units with c = 1 reads 

$$S = -m\int\sqrt{dt^2 - (d\vec x)^2} \qquad (5)$$


This action, compared to (3), treats space and time much more democratically. 

It didn’t happen this way in our civilization, but it could have happened this way in some 
other civilization far far away. I like writing alternative physics history. 

Of course, what actually happened was that your paper was rejected by a succession of 
referees. One of them told you that you should brush up on your calculus. Didn’t you know 
that integrals are supposed to have the form $\int dt$ of some function of t? This guy is easy 
to take care of: you write 


$$S = -m\int dt\sqrt{1 - \left(\frac{d\vec x}{dt}\right)^2} \qquad (6)$$


To show another referee that you can reproduce Newtonian mechanics, you expand (6) 
and rearrange slightly to obtain 


$$S = \int dt\left\{\frac{1}{2}m\left(\frac{d\vec x}{dt}\right)^2 - m + \cdots\right\} \qquad (7)$$


Heavens to Betsy, you even get that most famous $\frac{1}{2}$ in physics history, as in $\frac{1}{2}mv^2$, which 
in hindsight has been whispering “Square root square root” for centuries. 
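To see how good the expansion from (6) to (7) is, one can compare the two integrands numerically. The sketch below works in units with c = 1; the function names are labels introduced here, and the quartic term $\frac{1}{8}mv^4$ is simply the next term of the binomial series for the square root, shown only to indicate the size of what the dots in (7) hide.

```python
import math

def lagrangian_relativistic(m, v):
    """Integrand of (6): L = -m * sqrt(1 - v**2), in units with c = 1."""
    return -m * math.sqrt(1.0 - v * v)

def lagrangian_expanded(m, v):
    """Leading terms of (7): L ~ (1/2) m v**2 - m."""
    return 0.5 * m * v * v - m

def first_neglected_term(m, v):
    """Next term of the binomial series, (1/8) m v**4."""
    return m * v ** 4 / 8.0

m = 1.0
for v in (0.01, 0.1, 0.3):
    diff = lagrangian_relativistic(m, v) - lagrangian_expanded(m, v)
    print(f"v = {v}: exact - expanded = {diff:.3e}, "
          f"(1/8) m v^4 = {first_neglected_term(m, v):.3e}")
```

At everyday speeds the discrepancy is utterly negligible, while the constant term -m dwarfs the kinetic term.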


The one formula even the person in the street knows 


You don’t know how to incorporate a potential; just adding $-\int dt\, V(x)$ to the action (5), as in 
(1), would again favor dt. But if you had a potential as in (1), then in the nonrelativistic limit 
of (7), m would just be added to the potential V (x). You submit another paper interpreting 
the extra term m as a rather peculiar kind of potential energy. 

Even a particle just sitting there has energy, in fact an enormous amount of energy 
compared with the kinetic energy $\frac{1}{2}mv^2$ it could acquire in everyday life, with $v \ll c$. Let’s 
restore c and repeat the derivation: $S = -mc\int\sqrt{(c\,dt)^2 - d\vec x^2} = -mc\int dt\sqrt{c^2 - \left(\frac{d\vec x}{dt}\right)^2} = 
-m\int dt\left\{c^2 - \frac{1}{2}\left(\frac{d\vec x}{dt}\right)^2 + \cdots\right\} = \int dt\left\{\frac{1}{2}mv^2 - mc^2 + \cdots\right\}$. The referee snarled that this author 
didn’t even know that an additive constant in V(x) mattered not a whit. Paper rejected. 

Phew! It’s a good thing that you got tenure first before pursuing this stuff. Meanwhile, 
in a civilization in another galaxy far far away, a man named Albert discovered what is 
probably the most famous equation of all time² 


$$E = mc^2 \qquad (8)$$


More likely that the proverbial guy in the street has heard of this equation than of F = ma! 

At first sight, it may seem strange that the two terms in S, with opposite signs, both 
correspond to energy. Hamilton was clever enough to resolve this apparent problem. 
Recall from chapter II.3 that for the Lagrangian $L(q, \dot q)$, the Hamiltonian is given by 
$H(p, q) = p\dot q - L(q, \dot q)$ with $p = \frac{\delta L}{\delta\dot q}$. Hence, if we were given $L = \frac{1}{2}m\dot q^2 - mc^2$, we would 
have $p = m\dot q$, thus giving $H = p\dot q - \left(\frac{1}{2}m\dot q^2 - mc^2\right) = \frac{p^2}{2m} + mc^2$, which is exactly right for 
the energy of the system: kinetic energy plus $mc^2$. Well, all we’re saying is that $mc^2$ should 
be counted as part of the potential energy. 


A symmetry pops out 


Meanwhile, a mathematician friend of yours—could have been James Joseph Sylvester, 
a rather astute fellow, since he demanded that his salary from Johns Hopkins University 
be paid in gold before accepting its invitation to move from England to a scientifically 
impoverished but economically upstart country called the United States—told you about 
some fancy-pants math called matrix theory. He pointed out that if you defined a 4-by-4 
diagonal matrix $\eta_{\mu\nu}$, with (−1, +1, +1, +1) along the diagonal, you could write the 
combination $dt^2 - d\vec x^2$ in your action (5) more compactly as $-\eta_{\mu\nu}dx^\mu dx^\nu$, defining $x^0 = t$. 
At first you dismissed this as mere notational dressing, but after studying this matrix 
theory, you realized that your action did not change under the linear transformation 


$$dx^\mu \to \Lambda^\mu{}_\nu\, dx^\nu \qquad (9)$$


provided that $\eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = \eta_{\rho\sigma}$: a matrix equation to be solved for $\Lambda$. Nature has a “hidden” 
symmetry that extends and generalizes rotation! Of course, by this point, you did not even 
dream of publishing your discovery any more. You just showed it to Sylvester, who thought 
it might be somehow related to invariant theory, later developed into group theory. 

Ah, the joy of hindsight!³ If you were around earlier and felt the ugliness⁴ of the 
Newtonian action (1), you too could have discovered special relativity, and perhaps even 
the group theory of linear transformations! 
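The matrix condition on $\Lambda$ can be tested directly. The sketch below assumes the condition reads $\Lambda^T\eta\Lambda = \eta$ in matrix form and uses the boost convention $t' = \gamma(t + vx)$, $x' = \gamma(x + vt)$ that matches the $\Delta t'$ formula of the preceding chapter; all function names are introduced here for illustration. A boost and a rotation pass the test, while a Galilean transformation does not.

```python
import math

# Minkowski metric eta_{mu nu} with signature (-, +, +, +).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4-by-4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def boost_x(v):
    """Lorentz boost along x with velocity v (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, g * v, 0.0, 0.0],
            [g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    """Ordinary rotation about the z axis; time is untouched."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def galilean_x(v):
    """Galilean transformation t' = t, x' = x + v t, for comparison."""
    return [[1.0, 0.0, 0.0, 0.0],
            [v, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def invariance_defect(lam):
    """Largest entry of |Lambda^T eta Lambda - eta|; zero for a symmetry."""
    m = matmul(transpose(lam), matmul(ETA, lam))
    return max(abs(m[i][j] - ETA[i][j]) for i in range(4) for j in range(4))

print("boost:   ", invariance_defect(boost_x(0.6)))
print("rotation:", invariance_defect(rotation_z(0.7)))
print("Galilean:", invariance_defect(galilean_x(0.6)))
```

The nonzero defect of the Galilean transformation is one concrete way to see that the Newtonian action does not share this symmetry.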


Action, geometry, and mass as a conversion factor 


The action for a relativistic point particle has the elegant form 


$$S = -m\int\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu} \qquad (10)$$


Recognizing $\eta_{\mu\nu}dx^\mu dx^\nu$ as the Minkowskian distance squared between neighboring 
points, we see that the particle’s action is (up to an overall constant) the distance it has 
travelled in spacetime. An appealingly geometric* picture! 

Indeed, if we were told to construct the action for a point particle, the only coordinate 
invariant quantity we have available is the “length” or proper time duration of the worldline 


* Invite yourself at this point to generalize this action to that for a relativistic string. See, for example, QFT 
Nut, chapter IV.4 and a later section in this chapter. 


Figure 1 The action for a point particle. The only coordinate 
invariant quantity we have available is the “length” or proper 
time duration of the worldline traced out by the particle. 


traced out by the particle (figure 1), namely $\int d\tau = \int\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu}$. We call the proportionality 
factor the mass of the particle. If you like, that provides one cool definition, a 
rather profound one at that, of mass: mass is the conversion factor between geometry (the 
length of the worldline) and physics (the action). 

Bad notation alert! In fact, it is the same alert as in chapter II.1. The symbol $x^\mu$ refers 
to the spacetime coordinates of the particle traversing spacetime, not spacetime itself. 
Again, this is laid bare by considering the case of many particles labeled by an index 
a with $S = -\sum_a m_a\int\sqrt{-\eta_{\mu\nu}dx_a^\mu dx_a^\nu}$. The bad notation is best avoided by denoting 
the spacetime coordinates of the point particle by $X^\mu$ (as we already did in passing in 
chapter III.3): here, unlike in chapter II.1, $q^\mu$ would be nonstandard and a bit pedantic. 

Let us now go back to a single particle merely for ease of writing; you could add the 
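summation sign if you want. With $\zeta$ any parameter that varies monotonically along the 
worldline so that we can write $X^\mu(\zeta)$ as a function of $\zeta$, we can write the action as 

$$S = -m\int\sqrt{-\eta_{\mu\nu}dX^\mu dX^\nu} = -m\int d\zeta\sqrt{-\eta_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}} \equiv \int d\zeta\, L \qquad (11)$$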


The length of the worldline, being a geometric quantity, is manifestly reparametrization 
invariant, that is, independent of our choice of $\zeta$ as long as it is reasonable. As already 
remarked in chapter II.2, this is one of those “more obvious than obvious” facts, since 
$\int\sqrt{-\eta_{\mu\nu}dX^\mu dX^\nu}$ is manifestly independent of $\zeta$. Indeed, everything is the same as in 
chapter II.2, except that here the metric signature is spacetime rather than space, that is, 
Minkowskian rather than Euclidean. 
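If "more obvious than obvious" still leaves a doubt, a numerical check may help. The sketch below is an illustration only: the sample worldline $x(t) = 0.5\sin t$, the two parametrizations, and the function names are all choices made here, not taken from the text. It adds up $\sqrt{dt^2 - dx^2}$ along the same worldline in 1+1 dimensions, once stepping uniformly in t and once stepping uniformly in a warped parameter.

```python
import math

def worldline(t):
    """A sample timelike worldline X = (t, x(t)) with |dx/dt| <= 0.5 < 1."""
    return (t, 0.5 * math.sin(t))

def proper_length(t_of_zeta, n_steps=20000):
    """Sum sqrt(dt**2 - dx**2) along the worldline, stepping uniformly in zeta.

    t_of_zeta maps the parameter zeta in [0, 1] monotonically onto t in [0, 1].
    """
    total = 0.0
    t_prev, x_prev = worldline(t_of_zeta(0.0))
    for k in range(1, n_steps + 1):
        t_next, x_next = worldline(t_of_zeta(k / n_steps))
        dt, dx = t_next - t_prev, x_next - x_prev
        total += math.sqrt(dt * dt - dx * dx)
        t_prev, x_prev = t_next, x_next
    return total

# Parametrization 1: zeta is just the coordinate time t.
length_linear = proper_length(lambda z: z)

# Parametrization 2: a different smooth monotonic relabeling of the same curve.
length_warped = proper_length(lambda z: 0.5 * (z + z * z))

print("length with zeta = t:     ", length_linear)
print("length with warped zeta:  ", length_warped)
print("difference:               ", abs(length_linear - length_warped))
```

The two sums sample different points of the curve yet agree up to discretization error, and both come out smaller than the elapsed coordinate time of 1.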

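The Lagrangian has a rather odd-looking square root form (just as in chapter II.2) 

$$L = -m\left(-\eta_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}\right)^{\frac{1}{2}} \qquad (12)$$

but that does not stop us from sticking the Euler-Lagrange variation to it. Since L is 
independent of X, we obtain the equation of motion 

$$\frac{d}{d\zeta}\left(\frac{\delta L}{\delta\frac{dX^\mu}{d\zeta}}\right) = -m^2\frac{d}{d\zeta}\left(\frac{1}{L}\eta_{\mu\nu}\frac{dX^\nu}{d\zeta}\right) = 0 \qquad (13)$$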
To simplify (13), we exploit the freedom in choosing $\zeta$ and set $d\zeta$ to $d\tau$, so that $L = -m$. 
Then (13) becomes 

$$\frac{d^2X^\mu}{d\tau^2} = 0 \qquad (14)$$

as we already learned in chapter III.3 (and in some sense even in chapter II.2). 

Indeed, the geometrical action (11) was already foreshadowed by the triangle shown in 
figure III.3.1 and discussed in chapter III.3. 


Unification of material particles with light 


At the beginning of this chapter, you felt annoyed about how light and material particles 
were treated differently. Now you can actually do something about it. The nonrelativistic 
action (1) is manifestly incapable of describing a photon, the ultrarelativistic particle of 
light. The relativistic action (11), in contrast, might yet have a fighting chance. 

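At first glance, things do not look good, since the action (11), $S = -m\int d\zeta\sqrt{-\eta_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}}$, 
does not make sense for a massless particle. To remedy this, consider another action: 

$$\tilde S = \frac{1}{2}\int d\zeta\left(\sigma\left(\frac{dX}{d\zeta}\right)^2 - \frac{m^2}{\sigma}\right) \qquad (15)$$

where (as always) $\left(\frac{dX}{d\zeta}\right)^2 = \eta_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}$. 
This action looks strange at first sight: not only does it depend on the spacetime trajectory 
$X^\mu(\zeta)$ of our point particle, it also contains another dynamical variable $\sigma(\zeta)$. Notice, 
however, that $\frac{d\sigma}{d\zeta}$ does not appear in the action. Thus, the Euler-Lagrange equation for 
$\sigma(\zeta)$, namely $\frac{d}{d\zeta}\left(\frac{\delta\tilde L}{\delta\frac{d\sigma}{d\zeta}}\right) - \frac{\delta\tilde L}{\delta\sigma} = 0$ (with $\tilde L$ the integrand of $\tilde S$), collapses to $\frac{\delta\tilde L}{\delta\sigma} = 0$: 

$$\left(\frac{dX}{d\zeta}\right)^2 + \frac{m^2}{\sigma^2} = 0 \qquad (16)$$

This is an algebraic equation, not a differential equation, for $\sigma(\zeta)$. In other words, the 
dynamical variable $\sigma(\zeta)$ does not have dynamics of its own but rather is totally yoked⁵ to 
$X^\mu(\zeta)$. 

In fact, if we use (16) to eliminate $\sigma$ in $\tilde S$, we recover S. Thus, the two actions, S and $\tilde S$, 
are equivalent in the sense that they yield the same equation of motion for the particle. 
We can easily verify this equivalence explicitly. Varying $\tilde S$ with respect to $X^\mu$, we obtain 
$\frac{d}{d\zeta}\left(\frac{\delta\tilde L}{\delta\frac{dX^\mu}{d\zeta}}\right) = \frac{d}{d\zeta}\left(\sigma\eta_{\mu\nu}\frac{dX^\nu}{d\zeta}\right) = 0$, since $\tilde S$ does not depend on X explicitly. (Compare this 
with (13).) Using the equation of motion for $\sigma$ to eliminate it from this equation of motion 
for X, we obtain 

$$\frac{d}{d\zeta}\left(\frac{m}{\sqrt{-\left(\frac{dX}{d\zeta}\right)^2}}\,\eta_{\mu\nu}\frac{dX^\nu}{d\zeta}\right) = 0$$

As we have done many times by now, we use our freedom in choosing $\zeta$ and set $d\zeta$ to $d\tau$, 
which is defined by $\left(\frac{dX}{d\tau}\right)^2 = -1$. Then this simplifies to $\frac{d^2X^\mu}{d\tau^2} = 0$, and we recover (14). 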


You might say that the introduction of $\sigma$ is a cheap⁶ trick to get rid of the square root in 
the action S in (11). Indeed, (16) is trying to tell us that $\sigma^{-1}$ is a proxy for the square root. 
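The claim that eliminating $\sigma$ returns the square root action can be checked with numbers. The sketch below assumes that the integrand of the auxiliary-variable action has the form $\frac{1}{2}\left(\sigma\dot X^2 - m^2/\sigma\right)$, with $\dot X^2 = \eta_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}$ and signature (−, +, +, +); the function names and the sample velocity are illustrative choices made here.

```python
import math

def x_dot_squared(u):
    """eta_{mu nu} u^mu u^nu with signature (-, +, +, +)."""
    return -u[0] ** 2 + u[1] ** 2 + u[2] ** 2 + u[3] ** 2

def lagrangian_sqrt(m, u):
    """Square root Lagrangian (12): L = -m * sqrt(-u.u)."""
    return -m * math.sqrt(-x_dot_squared(u))

def lagrangian_aux(m, u, sigma):
    """Assumed auxiliary-variable integrand: (1/2) (sigma * u.u - m**2 / sigma)."""
    return 0.5 * (sigma * x_dot_squared(u) - m * m / sigma)

def sigma_on_shell(m, u):
    """Solution of the algebraic equation for sigma: sigma = m / sqrt(-u.u)."""
    return m / math.sqrt(-x_dot_squared(u))

m = 1.5
u = (2.0, 0.5, -0.3, 1.0)          # a timelike dX/dzeta: u.u = -2.66 < 0
sigma_star = sigma_on_shell(m, u)

# Plugging sigma back in reproduces the square root Lagrangian.
print("L from (12):          ", lagrangian_sqrt(m, u))
print("L with sigma removed: ", lagrangian_aux(m, u, sigma_star))

# sigma_star is a stationary point: the central-difference slope vanishes.
h = 1e-6
slope = (lagrangian_aux(m, u, sigma_star + h)
         - lagrangian_aux(m, u, sigma_star - h)) / (2 * h)
print("d(L_aux)/d(sigma) at sigma_star:", slope)
```

Note that $\sigma^{-1}$ evaluated on shell is $\sqrt{-\dot X^2}/m$, which is the sense in which it stands in for the square root.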


Massless particles 


After this slightly long, although straightforward, manipulation, you might have lost sight 
of what we want to achieve. Let me remind you: we would like to have a unified treatment of 
massive and massless particles. Newtonian physics cannot deal with massless particles at 
all. It hardly makes sense to set m = 0 in F = ma. Nor is the action S in (11) up to the task. 

Now, ta dah, setting m = 0 in $\tilde S$, we obtain an action 

$$S_{\rm massless} = \frac{1}{2}\int d\zeta\left(\sigma\eta_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}\right) \qquad (17)$$

which makes perfect sense. Varying with respect to $\sigma(\zeta)$ now gives 

$$\eta_{\mu\nu}dX^\mu dX^\nu = 0 \qquad (18)$$

for a massless particle, or in other words, $(d\vec X)^2 = (dX^0)^2$. We recover what we have always 
known, that massless particles travel at the speed of light. 


To be massless in the contemporary world 


To me, one truly profound intellectual triumph of special relativity, with far-reaching 
impact on contemporary particle physics, is the notion of a massless particle. To appreciate 
how mysterious and alien this concept is, try explaining it to an intelligent person who 
happens not to be a physicist. One problem is the definition of mass in elementary physics 
texts: typically, mass is said to be the amount of substance contained in the object. You 
mean something without substance can still have energy and momentum? 

But we could perfectly well set m = 0 in the Einsteinian $E = \sqrt{\vec p^2 + m^2}$, in contrast to 
the Newtonian $E = \frac{\vec p^2}{2m}$. In contemporary particle physics,⁷ all the known particles—the 
various quarks, the electron, the electron neutrino, and their various cousins; the photon 
and its various cousins responsible for the strong and weak interactions; and the graviton, 
responsible for gravity—all of them start out in life* massless. (They acquire masses only 
later, through a phenomenon known as spontaneous symmetry breaking that does not 
concern us here.) 

Before 1905 and special relativity, there was no need for a massless particle. Light was 
known to be a wave. But 1905, Einstein’s annus mirabilis, was also the year Einstein came 
up with the Nobel Prize-winning idea of light as consisting of photons. 


* More precisely, the quantum fields corresponding to all known particles appear in the action as massless 
fields. 
Figure 2 A Babylonian tablet (drawn in a modern pictorial representation). 


Babylonian tablet 


There is a Babylonian tablet from about 6,000 years ago on which figure 2 was inscribed.* 
Can you figure out what it says? 

Take the small square of side b and replace it by a rectangle of equal area, of sides a and 
$b^2/a$. Cut the rectangle into two equal smaller rectangles, and paste them onto the sides, 
kitty-corner, of the large square of side a. The author of the tablet was trying to tell you that 
the result is almost a square of side $a + \frac{b^2}{2a}$. 

Here is the algebraic translation of the Babylonian tablet 

$$\sqrt{a^2 + b^2} \simeq a + \frac{b^2}{2a} + \cdots \qquad (19)$$


Clever, no? That guy would have surely gotten the Fields Medal had it existed. That tablet 
blew me away when I saw it. I wondered whether that Babylonian could have thought, in 
his wildest imagination, that his discovery also held the secret of space and time. If so, he 
deserved the Nobel Prize in addition to the Fields Medal. 


* What figure 2 shows is of course my attempt at copying the tablet, not the original tablet. 




Appendix 1: The preferred parameter choice for massless particles 


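As before, if we use the parameter $\zeta$, somebody else could use $\zeta'$ as long as the two parameters are related 
by a smooth monotonic function. We couldn’t have lost the reparametrization invariance enjoyed by 
the action S by going to an equivalent form. To verify this, change variable $\zeta \to \zeta'$ in $\tilde S$ and write 
$\tilde S = \frac{1}{2}\int d\zeta'\,\frac{d\zeta}{d\zeta'}\left(\sigma\left(\frac{d\zeta'}{d\zeta}\right)^2\left(\frac{dX}{d\zeta'}\right)^2 - \frac{m^2}{\sigma}\right)$. Upon defining $\sigma(\zeta) = \sigma'(\zeta')\frac{d\zeta}{d\zeta'}$, we recover the same form (15) of $\tilde S$ as 
expected. 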

So then, is there a “best” choice for $\zeta$? For massive particles, we know the answer: proper time. But what is 
the best choice for a massless particle, for which proper time has no meaning? 

When we vary $\tilde S$ with respect to $X^\mu$, we obtain $\frac{d}{d\zeta}\left(\sigma\eta_{\mu\nu}\frac{dX^\nu}{d\zeta}\right) = 0$. Multiply this by $\sigma$ and go to a parameter 
$\zeta'$ defined by $\sigma(\zeta) = \frac{d\zeta}{d\zeta'}$. Then this equation of motion becomes simply $\frac{d^2X^\mu}{d\zeta'^2} = 0$. A parameter that makes the 
equation of motion take on this simple form is known as an affine parameter. (I must confess that I have always 
disliked this wishy-washy word “affine”—none of the strength of character of a word such as “entropy,” for 
example.) 

Let’s see what this somewhat formal discussion is all about in the case of a photon propagating along, say, the 
x direction. We don’t have to solve any equation of motion to know that $X^\mu$ is proportional to (1, 1, 0, 0). End of 
story. No need to parametrize this worldline going across Minkowski spacetime at 45°. The equation (18) stands 
on its own merits: it does not need a parameter. 

But if you insist, you could write $X^\mu = f(\zeta)(1, 1, 0, 0)$ with some monotonic function f. Then $\frac{d^2X^\mu}{d\zeta^2} = 
(f''/f)X^\mu$. To make life easier, we should obviously choose f to be a linear function of $\zeta$. I explain all this 
seemingly useless stuff here, because we will encounter similar considerations in curved spacetime. To me, the 
pages some texts spend on the affine parametrization of massless particles literally amounts to much ado about 
nothing. 


Appendix 2: Baby string theory 


The take-home message is that the action principle together with symmetry makes for a powerful combination. 
Indeed, so powerful that it enables us to generalize the action for a relativistic particle almost immediately to the 
action for a relativistic string.⁸ The reader seeing all this for the first time might wish to skip this appendix. 

The location of a point particle in d-dimensional spacetime is given by $X^\mu(\tau)$, with $\mu = 0, 1, \ldots, d - 1$, and 
that is that. In contrast, the location of a string in d-dimensional spacetime is given by $X^\mu(\tau, \sigma)$, where $\sigma$ is a 
parameter telling us where we are along the length of the string. For an open string, $\sigma$ starts at 0 on one end 
of the string and ends at some $\sigma_{\max}$ at the other end. Thus, $X^\mu(\tau, 0)$ and $X^\mu(\tau, \sigma_{\max})$ give the locations of the two 
ends. For a closed string, $\sigma$ is conventionally taken to range between 0 and $2\pi$, with the two ends identified: 
$X^\mu(\tau, 0) = X^\mu(\tau, 2\pi)$. 

As $\tau$ varies, $X^\mu(\tau)$ sweeps out a worldline. In precisely the same way, the location $X^\mu(\tau, \sigma)$ of a string sweeps 
out a world sheet, an open sheet (figure 3a) with a boundary for the open string, and a cylindrical tube (figure 
3b) for the closed string. 

You learned way back in chapter I.6 how to embed a curved surface in a higher dimensional space. Now is 
the time to put your knowledge to good use! Denote* $\sigma^0 = \tau$, $\sigma^1 = \sigma$. Then $\sigma^\alpha$ ($\alpha = 0, 1$) provides a coordinate 
system on the surface swept out by the string. Recall from chapter I.6 that a metric† is induced on the surface, in 
this context by the ambient Minkowski metric $\eta_{\mu\nu}$: $G_{\alpha\beta} = \eta_{\mu\nu}\partial_\alpha X^\mu\partial_\beta X^\nu$, where $\partial_\alpha X^\mu = \frac{\partial X^\mu}{\partial\sigma^\alpha}$. 

In the case of the point particle, the only quantity with intrinsic geometric significance is the length of the 
worldline. For the string, the corresponding quantity is the area of the world sheet. But you know from chapter 
I.5 how to write down the area of a surface, namely $\int d^2\sigma\sqrt{\det G}$, where det G denotes the determinant of the 
2-by-2 matrix $G_{\alpha\beta}$ and $d^2\sigma = d\tau\,d\sigma$. The basic action for string theory 

$$S_{\text{Nambu-Goto}} = T\int d\tau\,d\sigma\sqrt{\det G} \qquad (20)$$


* Sometimes $\sigma^\alpha$ is called $x^\alpha$, but we want to avoid confusion with the $x^\mu$ appearing earlier in this chapter. 
† To avoid confusion, I call the induced metric G, not g as in chapter I.6. 
Figure 3 A string sweeps out a world sheet: (a) an open sheet with a 
boundary for an open string and (b) a cylindrical tube for a closed string. 


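was first proposed by Nambu and Goto. The overall constant T, with dimensions of 1/(length)² or mass/length, 
may be interpreted as the string tension (as in chapter II.1). 

We can readily verify that if we change the coordinates $\sigma^\alpha$ on the world sheet, we leave the string action invariant. 
Under a “world sheet” reparametrization $\sigma^\alpha \to \sigma'^\alpha(\sigma)$, the integration measure changes by $d^2\sigma = d^2\sigma' J$, 
where the Jacobian J is the determinant of the 2-by-2 matrix $\frac{\partial\sigma^\alpha}{\partial\sigma'^\beta}$. Meanwhile, 
$G_{\alpha\beta} = \eta_{\mu\nu}\frac{\partial X^\mu}{\partial\sigma'^\gamma}\frac{\partial X^\nu}{\partial\sigma'^\delta}\frac{\partial\sigma'^\gamma}{\partial\sigma^\alpha}\frac{\partial\sigma'^\delta}{\partial\sigma^\beta} = G'_{\gamma\delta}\frac{\partial\sigma'^\gamma}{\partial\sigma^\alpha}\frac{\partial\sigma'^\delta}{\partial\sigma^\beta}$, so that $\det G = J^{-2}\det G'$. Thus, finally $\int d^2\sigma\sqrt{\det G} = \int d^2\sigma'\sqrt{\det G'}$, and the action is indeed 
geometrical. 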


Exercises 


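1 In a precise parallel with the discussion for the point particle, we can avoid the square root in the Nambu-Goto 
action and instead use the action 

$$\tilde S = \frac{1}{2}T\int d\tau\,d\sigma\,\gamma^{\frac{1}{2}}\gamma^{\alpha\beta}\left(\partial_\alpha X^\mu\partial_\beta X_\mu\right) \qquad (21)$$

with $\gamma_{\alpha\beta}$ ($\gamma = \det\gamma_{\alpha\beta}$) an auxiliary variable playing the same role as the auxiliary variable in (15). Show that $\tilde S$ 
is equivalent classically to $S_{\text{Nambu-Goto}}$. 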


2 Show that the string action is invariant under an arbitrary local rescaling 

$$\gamma_{\alpha\beta}(\tau, \sigma) \to e^{2\omega(\tau, \sigma)}\gamma_{\alpha\beta}(\tau, \sigma)$$

known as a Weyl transformation. As a result, the equations of motion determine $\gamma_{\alpha\beta}$ only up to this rescaling. 


Notes 


1. No, Al Gore did not invent “algorithm.” 

2. In 1908, Einstein wrote to Johannes Stark complaining that the latter did not properly acknowledge his 
priority in deriving E = mc?. Stark wrote back, and Einstein was appropriately apologetic, writing “People 
who have been privileged to contribute something to the advancement of science should not let such things 
becloud their joy over the fruits of common endeavor.” By the way, Stark later called Werner Heisenberg “a 
white Jew” for defending Einstein’s theory of relativity. 

3. Psychologists have quantified this phenomenon of feeling that in hindsight things always seem so easy 
(B. Fischhoff, J. Exp. Psych. 3 (1977), p. 349). 
4. The ugliness can be quantified by comparing the symmetry groups of nonrelativistic and relativistic physics, 
the Galilean group versus the Lorentz group. See F. Dyson, “Missed Opportunities,” Bull. Am. Math. Soc. 78 
(1972), p. 635. 

5. In quantum field theory, something like $\sigma(\zeta)$ is known as an auxiliary field. 

6. Albeit one used quite often in quantum field theory and string theory! 

7. I remember vividly arguing with my professor S. B. Treiman when I was an undergraduate. He had told 
me about the soft photon theorems in particle physics, which describe the interaction of the photon with 
charged particles in the limit that the photon’s energy and momentum go to zero. I thought that the photon 
ceased to exist, but he reminded me that the photon’s spin was still there. I simply could not understand 
how something with no mass, no energy, no momentum, and no quantum number could still be spinning. 

8. J. Polchinski, String Theory, vol. 1. 


Completion, Promotion, and the Nature of 
the Gravitational Field 


Natural and unnatural quantities 


We now know that nonrelativistic physics is but an approximation to a deeper truth. Since
physics is Lorentz invariant, all physical quantities have to transform in a well-defined
fashion under the Lorentz group, namely according to some definite representation of
the group. The 3-vector $d\vec{x}$ has to be unified with the 3-scalar $dt$ to form a 4-vector
$dx^\mu = (dt, d\vec{x})$. Neither $d\vec{x}$ nor $dt$ is relativistically complete. As Minkowski foresaw,
space by itself, and time by itself, have now “faded away into mere shadows,” at least
in fundamental physics.

In this chapter, we will study how various quantities in nonrelativistic physics are to be 
“completed and promoted.” Doing this, we will also discover the nature of the gravitational 
field and be one step closer to the main subject of this book. 


Completion and promotion 


Consider the 3-velocity $\vec{v}_N = \frac{d\vec{x}}{dt}$. What an awkward quantity, a 3-vector $d\vec{x}$ divided by a 3-scalar
$dt$ that happens to be the “time component” of a 4-vector. Ugh! A 3-vector divided
by a 3-scalar, with neither transforming nicely under the Lorentz group. Gimme a break,
no way the resulting object $\vec{v}_N$ is going to transform nicely!

In contrast, consider a 3-vector $d\vec{x}$ divided by a Lorentz scalar, namely $d\tau$ (defined as always
by $d\tau^2 = -\eta_{\mu\nu}dx^\mu dx^\nu = dt^2 - d\vec{x}^2$). Now we are talking: the resulting object $\frac{d\vec{x}}{d\tau}$ can
be contained in a 4-vector. Indeed, let us define $\vec{v} = \frac{d\vec{x}}{d\tau}$ as a 3-vector whose relativistic
completion is naturally the 4-vector $v^\mu = \frac{dx^\mu}{d\tau}$. (In other words, $\vec{v}$ consists of the $v^i$ components
of the 4-vector $v^\mu$.) The velocity $v^\mu$ is the right thing to study.

If you insist, as some authors do, on writing relativistic quantities in terms of “nonrelativistic
quantities,” you can do it (it’s a free country), but it’s best not to do so. For
example, since $\frac{dt}{d\tau} = \frac{dt}{\sqrt{dt^2 - d\vec{x}^2}} = \frac{1}{\sqrt{1 - \left(\frac{d\vec{x}}{dt}\right)^2}}$, we can write

$$\frac{d\vec{x}}{d\tau} = \frac{dt}{d\tau}\frac{d\vec{x}}{dt} = \frac{\vec{v}_N}{\sqrt{1 - v_N^2}}$$

(with $v_N^2 = \vec{v}_N \cdot \vec{v}_N$) and other awkward looking relations such as $\vec{v} = v^0\vec{v}_N$. It may be
convenient on occasions, but it sure ain’t natural.
In theoretical physics, you are free to define any quantity you want, but two criteria are
of prime concern: (1) whether the quantity you define is useful in the particular situation
you are considering, and (2) whether the quantity you define is conceptually natural and
thus serves to deepen, rather than confound, your understanding.

Our discussion makes clear that $\vec{v}_N = \frac{d\vec{x}}{dt}$ and $\vec{v} = \frac{d\vec{x}}{d\tau}$ represent different quantities.
Some of the confusion over special relativity stems from confounding the two different
velocities. Note that $\vec{v}_N$, not $\vec{v}$, is the velocity that appears in the factor $\sqrt{1 - v_N^2}$. To
compound the confusion, people often omit the subscript $N$, a standard practice that we
will also indulge in. Note also that while $v_N^2 < 1$, the quantity $\vec{v}^2$ ranges from 0 to $\infty$. Never
mind what the subscript $N$ stands for (but if you insist, you could try “Newtonian”). (In
this connection, some authors also adopt for $v_N$ the unfortunate notation $v_{NR}$ with NR
standing for “nonrelativistic,” even though $v_{NR}$ can get up to light speed.)
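For readers who like to see numbers, here is a minimal Python sketch (not from the original text; the function names `four_velocity` and `minkowski_square` are invented for illustration) that builds the 4-velocity $v^\mu = dx^\mu/d\tau$ from a Newtonian 3-velocity $\vec{v}_N$, assuming units with $c = 1$ and the signature $(-, +, +, +)$ used throughout.

```python
import math

def four_velocity(v_n):
    """Return v^mu = dx^mu/dtau for a Newtonian 3-velocity v_N, in units with c = 1."""
    speed_sq = sum(component ** 2 for component in v_n)
    if speed_sq >= 1.0:
        raise ValueError("|v_N| must be less than 1")
    dt_dtau = 1.0 / math.sqrt(1.0 - speed_sq)  # dt/dtau = 1/sqrt(1 - v_N^2)
    return (dt_dtau,) + tuple(dt_dtau * component for component in v_n)

def minkowski_square(a):
    """eta_{mu nu} a^mu a^nu with the signature (-, +, +, +) used in the text."""
    return -a[0] ** 2 + sum(component ** 2 for component in a[1:])

v = four_velocity((0.6, 0.0, 0.0))
print(v)                    # v^0 first, then the spatial components of v
print(minkowski_square(v))  # close to -1, as follows from the definition of dtau
```

Note that the spatial part of the output exceeds the input $\vec{v}_N$ by the factor $v^0$, which is why $\vec{v}^2$ is unbounded while $v_N^2 < 1$.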


Laws to be promoted 


Concepts (such as $\vec{v}_N$) appropriate for space and time should be promoted to concepts
(such as $v^\mu$) appropriate for spacetime. Correspondingly, the laws of physics have to be
promoted also. Toward the end of chapter I.3 on rotation, I explained why physicists insist
that physical quantities must transform “nicely.” (The reader new to this might want to
reread what I said there.) The niceness is not so much an aesthetic nicety but rather the
fundamental requirement that physical laws should not depend on the observer. Physics
must be independent of physicists!

Consider the law of momentum conservation. Multiplying $v^\mu$ by the mass, we have
the 4-momentum $p^\mu = mv^\mu = m\frac{dx^\mu}{d\tau}$ of a particle of mass $m$ moving with 4-velocity $v^\mu$.
The conservation of nonrelativistic 3-momentum $\vec{p}_N = m\vec{v}_N$ in nonrelativistic physics
strongly suggests that the 4-momentum $p^\mu = m\frac{dx^\mu}{d\tau}$ is also conserved. (Ultimately, this
statement has to be verified by empirical measurement rather than by philosophical
pronouncements, of course.)

In particle physics, two particles of momentum $p_1$ and $p_2$ collide to produce a bunch of
particles (could be two or could be two hundred, all with different masses). Momentum
conservation states that $\sum_{(a \in I)} p_a^\mu = \sum_{(a \in F)} p_a^\mu$. Here the index $a$ labels the different
particles, and the subscripts on the sums, $(a \in I)$ and $(a \in F)$, instruct us to sum over all
the particles in the initial and final states, respectively. This is just the direct generalization
of the familiar $\sum_{(a \in I)} \vec{p}_{Na} = \sum_{(a \in F)} \vec{p}_{Na}$. Now let us define $K^\mu \equiv \sum_{(a \in I)} p_a^\mu - \sum_{(a \in F)} p_a^\mu$
so that momentum conservation states $K^\mu = 0$. The whole point is that since $K'^\mu = \Lambda^\mu{}_\nu K^\nu$,
if $K^\mu = 0$, then $K'^\mu = 0$.

If Ms. Unprime has momentum conservation, then Mr. Prime better have momentum
conservation also. This is the reason momentum must transform like a 4-vector in a 
Lorentz invariant world. We are not saying anything conceptually different from what we 
said at the end of chapter I.3. (Note also that this discussion does not depend in any way on 
the details of particle physics except that particles can be produced in collisions.) Indeed, 
we had already used this argument in chapter III.3 when we “guessed” what the equation 
of motion of a free particle must look like: one requirement was that Ms. Unprime and 
Mr. Prime subscribe to the same equation, which amounts to saying that dx transforms 
like a 4-vector. (Notice that in this chapter, we are slipping back into the bad notation of 
using x" for the location of the particle.) 

But now comes the important point, indicating that we have gone past rotational invariance.
The power of Lorentz invariance is such that you cannot conserve $p^i$ without also
conserving $p^0$, since they transform into linear combinations of each other.

Let us expand $p^0 = m\frac{dt}{d\tau} = \frac{m}{\sqrt{1 - v_N^2}}$ to find out what it is. Restoring $c$, we have
$p^0 = mc^2 + \frac{1}{2}mv_N^2 + O(v_N^4)$. The big surprise (as we have already seen in the preceding chapter)
is of course that the Newtonian kinetic energy $\frac{1}{2}mv_N^2$ is not the leading term in the
expansion of $p^0$ but the second term. Even a particle at rest has energy!
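The expansion of $p^0$ is easy to check numerically. The following Python sketch (an illustration only; the function names are hypothetical, and I set $m = c = 1$ in the loop) compares the exact expression $mc^2/\sqrt{1 - v_N^2/c^2}$ with the first two terms $mc^2 + \frac{1}{2}mv_N^2$. The leftover should shrink like $v_N^4$; the standard Taylor coefficient of the next term is $\frac{3}{8}mv_N^4/c^2$.

```python
import math

def energy_exact(m, v_n, c=1.0):
    """p^0 with c restored: m c^2 / sqrt(1 - v_N^2 / c^2)."""
    return m * c ** 2 / math.sqrt(1.0 - (v_n / c) ** 2)

def energy_two_terms(m, v_n, c=1.0):
    """Rest energy plus Newtonian kinetic energy: m c^2 + (1/2) m v_N^2."""
    return m * c ** 2 + 0.5 * m * v_n ** 2

for v_n in (0.01, 0.1, 0.5):
    exact = energy_exact(1.0, v_n)
    approx = energy_two_terms(1.0, v_n)
    # The remainder is dominated by (3/8) v_N^4 when v_N is small.
    print(v_n, exact, approx, exact - approx, 0.375 * v_n ** 4)
```

At $v_N = 0.5$ the two-term formula is already visibly off, a reminder that the Newtonian kinetic energy is only the second term of an infinite series.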


The most famous formula in physics 


What we have discovered is that $p^0$ is the energy of a particle of mass $m$ moving with 4-velocity
$v^\mu$, not exactly the energy we knew, but the energy we knew plus $mc^2$ (and an
infinite series besides)! So we officially give $p^0$ another name, namely $E \equiv p^0$.

A “real” relativistic physicist should of course not write the nonrelativistic formula*
$E = mc^2$. You might tell your lay friends that the pros write $E = mc^2$ as

$$-p^2 = m^2 \qquad (1)$$

since we have $p^2 = \eta_{\mu\nu}p^\mu p^\nu = -(p^0)^2 + (\vec{p})^2 = m^2\left(\eta_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\right) = -m^2$, using the definition
of $d\tau^2$ in the last step. We should think of (1) as a constraint on $p$, a constraint known
as the mass shell or on shell condition, because in the 4-dimensional space spanned by
$p^\mu = (p^0, \vec{p})$, (1) restricts the momentum to a hyperbolic shell defined by $(p^0)^2 - (\vec{p})^2 = m^2$.


* Einstein’s famous paper¹ contained the result

$$K_0 - K_1 = \frac{L}{V^2}\frac{v^2}{2}$$

What! It doesn’t look like $E = mc^2$ to you? Einstein is telling us that when an object moving at velocity $v$ radiates,
its kinetic energy $K$ changes by (in modern notation) $\delta K = \frac{1}{2}\frac{E}{c^2}v^2$. (In his paper, $L$ denotes the energy emitted
in radiation and $V$ the speed of light.) See appendix 1. He then goes on to say, a couple of paragraphs later, “It is
not excluded that it will prove possible to test this theory using bodies whose energy content is variable to a high
degree (e.g., radium salts).”
It is worth emphasizing that the relation (1) holds in any frame for any particle, including
those for which $m = 0$. The same is not true of the more famous $E = mc^2$.

Strictly speaking, the conservation of $p^\mu$ (and hence of $p^0$) must be regarded as a theoretical
suggestion, albeit a very strong suggestion, to be verified by careful experimental
measurements. Indeed, so far we have only noninteracting free particles. At this stage, we
do not even know how to include interactions between particles. The nonrelativistic expedient
of introducing a potential $V(\vec{x}_1 - \vec{x}_2)$ does not work, since it is not Lorentz invariant.*
Here the power of Noether’s theorem shines through. Without having to specify the interaction,
we know that $p^\mu$ conservation will follow as long as the interaction is invariant
under spacetime translation; that is, it does not depend on any specific point in spacetime,
which certainly seems reasonable. Recall from chapter II.4 that in nonrelativistic physics,
conservation of 3-momentum and of energy follow from invariance under translation in
space and in time, respectively. It is pleasing that these two translation invariances are now
unified into a single invariance, just as space and time are unified into spacetime.

Logically, nothing in our discussion says that mass can change and the enormous
amount of energy locked up in the rest energy $mc^2$ can be released. As you know, that
is indeed possible. But input from atomic, nuclear, and particle physics is needed to tell
us how and when mass can change.


Relativistic kinematics 


To illustrate what 4-momentum conservation (henceforth, just momentum conservation)
can do, let us go through Compton scattering, in which a photon† of momentum $k$ hits
an electron at rest and goes off with momentum $k'$. See figure 1 depicting this process
in the lab frame. Let us find the frequency $\omega'$ of the outgoing photon as a function of the
scattering angle $\theta$ in the lab frame as measured from the direction of the incoming photon.

Momentum conservation gives $k + p = k' + p'$, with $p$ the initial and $p'$ the final
momentum of the electron. Note that we are always talking about 4-momentum unless
otherwise stated. The desired quantity can be extracted from the relativistic invariant $k' \cdot p$,
which is equal to $-m\omega'$ when evaluated in the lab frame in which $p = (m, \vec{0})$. (Notice that
we do not simply say that $k' \cdot p$ is equal to $-m\omega'$. We must specify the frame, since $\omega'$ is not
a Lorentz invariant quantity.) We have $k' \cdot p = k' \cdot (k' + p' - k) = k' \cdot p' - k' \cdot k$. In the first
equality, we used momentum conservation; in the second, we used the fact that the photon
is massless to set $k'^2 = 0$ invoking (1). From $k + p = k' + p'$, it follows that $(k + p)^2 = (k' + p')^2$.
Since $(k + p)^2 = 2k \cdot p - m^2$ and $(k' + p')^2 = 2k' \cdot p' - m^2$, we obtain $k \cdot p = k' \cdot p'$.
Combining this with the previous relation, we find $k' \cdot p = k \cdot p - k' \cdot k$. Evaluating this in


* To introduce interactions correctly, we have to use fields, as we will see. Classical field theory suffices; no 
need for quantum field theory yet. 

† In Einstein’s other great 1905 paper, he introduced the concept of the photon and showed that its energy and
3-momentum are given by $E = \hbar\omega$ and $\vec{p} = \hbar\vec{k}$, respectively. We are using natural units with $\hbar = 1$ and denoting
the photon 4-momentum by $k = (\omega, \vec{k})$.
Figure 1 Compton scattering. A photon of
momentum $k$ hits an electron at rest and
goes off with momentum $k'$ at an angle $\theta$.


the lab frame, we obtain $m\omega' = m\omega - (\omega'\omega - \vec{k}' \cdot \vec{k}) = m\omega - \omega\omega'(1 - \cos\theta)$, and thus²

$$\omega' = \frac{\omega}{1 + \frac{\omega}{m}(1 - \cos\theta)} \qquad (2)$$

Note the downward shift of the photon frequency as a function of scattering angle. (Relativistic
kinematics alone cannot tell you the range of angles the photon prefers to come
out in, namely the differential cross section.)³

This example provides a prototype for how such problems, typical of particle physics 
and astrophysics, should be approached. First, write the quantity you want to determine in 
terms of some Lorentz invariant. Then use the equations you have, namely the mass shell 
condition for each particle and momentum conservation. It is best to keep the calculation 
Lorentz invariant until the last stage, at which point you can evaluate the result in any 
frame you like. 
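As a sanity check on this recipe, here is a small Python sketch (mine, not the author's; the function names and the numerical values of $\omega$, $\theta$, and $m$ are arbitrary choices for illustration). It evaluates (2) and then verifies that the recoiling electron, whose momentum is fixed by $p' = k + p - k'$, indeed lands on its mass shell $-p'^2 = m^2$. Natural units with $\hbar = c = 1$ and the incoming photon along the 1-axis are assumed.

```python
import math

def compton_outgoing_frequency(omega, theta, m):
    """Equation (2): omega' = omega / (1 + (omega/m)(1 - cos theta))."""
    return omega / (1.0 + (omega / m) * (1.0 - math.cos(theta)))

def minkowski_square(a):
    """eta_{mu nu} a^mu a^nu with signature (-, +, +, +)."""
    return -a[0] ** 2 + sum(component ** 2 for component in a[1:])

def recoil_electron_momentum(omega, theta, m):
    """p' = k + p - k' in the lab frame, with the incoming photon along the 1-axis."""
    omega_out = compton_outgoing_frequency(omega, theta, m)
    k_in = (omega, omega, 0.0, 0.0)
    k_out = (omega_out, omega_out * math.cos(theta), omega_out * math.sin(theta), 0.0)
    p_in = (m, 0.0, 0.0, 0.0)
    return tuple(k + p - kp for k, p, kp in zip(k_in, p_in, k_out))

m, omega, theta = 0.511, 1.0, 1.0            # illustrative numbers only
p_out = recoil_electron_momentum(omega, theta, m)
print(compton_outgoing_frequency(omega, theta, m))  # smaller than omega
print(minkowski_square(p_out), -m * m)               # the two should agree
```

The check works only because (2) was derived from the two ingredients named above, momentum conservation and the mass shell condition; change the formula slightly and $p'^2$ no longer equals $-m^2$.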


The relativistic Doppler shift again 


The preceding discussion gives an alternative derivation of the relativistic Doppler shift. 
Suppose a particle of momentum p emits a photon of momentum k, which is then 
absorbed by a particle of momentum p’. 

To obtain the frequency shift, our strategy is to evaluate the Lorentz scalar $k \cdot p'$ in two
frames and demand that the results agree. In the rest frame of the emitting particle, $p = (m, \vec{0})$,
$k = (\omega, \vec{k})$, and $p' = m'\left(\frac{1}{\sqrt{1 - v^2}}, \frac{\vec{v}}{\sqrt{1 - v^2}}\right)$, where $\vec{v}$ is the velocity of the absorbing
particle in the rest frame of the emitting particle, so that* $k \cdot p' = -m'(\omega - \vec{v} \cdot \vec{k})/\sqrt{1 - v^2}$.
(Note that we do not assume that the emitting particle and the absorbing particle necessarily
have the same mass.) However, in the rest frame of the absorbing particle, where $p' = (m', \vec{0})$
and $k = (\omega', \vec{k}')$, with $\omega'$ the frequency of the photon seen by the absorbing particle, we
have $k \cdot p' = -m'\omega'$. Since the scalar $k \cdot p'$ has the same value in all frames, we equate these
two expressions for $k \cdot p'$ to obtain $\omega' = (\omega - \vec{v} \cdot \vec{k})/\sqrt{1 - v^2}$. Writing $\vec{v} \cdot \vec{k} = \omega|\vec{v}|\cos\theta$, we


* Note that here $\vec{v}$ is the concept previously known as $\vec{v}_N$. I got tired of writing the clarifying subscript. A
simple rule (as has already been mentioned): the $v$ inside the famous square root $\sqrt{1 - v^2}$ is $v_N$.
see that this agrees with our result in (III.3.9). With the convention here, $\theta = 0$ means that
the receiver is receding and hence sees a redshift.
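In code, the Doppler formula just derived is a one-liner. The Python sketch below (an illustration with a made-up function name) evaluates $\omega' = \omega(1 - |\vec{v}|\cos\theta)/\sqrt{1 - v^2}$, with $\theta$ the angle between $\vec{v}$ and $\vec{k}$ in the rest frame of the emitter and $c = 1$.

```python
import math

def doppler_frequency(omega, v, theta):
    """omega' = omega (1 - v cos theta) / sqrt(1 - v^2), with v the speed |v| < 1."""
    return omega * (1.0 - v * math.cos(theta)) / math.sqrt(1.0 - v * v)

print(doppler_frequency(1.0, 0.6, 0.0))          # receding receiver: redshift
print(doppler_frequency(1.0, 0.6, math.pi))      # approaching receiver: blueshift
print(doppler_frequency(1.0, 0.6, math.pi / 2))  # transverse case: pure time dilation factor
```

For $\theta = 0$ the expression collapses to the familiar $\omega\sqrt{(1 - v)/(1 + v)}$, which for $v = 0.6$ is $\omega/2$.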

The particles in this “particle physics” derivation could be replaced by observers of
course. Let each observer “carry” a 4-vector $U$, which has the form $U^\mu = (1, \vec{0})$ in his or
her rest frame. Then simply replace $p$ and $p'$ in the derivation by $U$ and $U'$, respectively,
since the masses $m$ and $m'$ are not relevant to this discussion. We will encounter $U^\mu$ again
later in this chapter.


Currents 


Our next example of relativistic completion is a bit more involved. Consider the number 
density (of atoms, molecules, particles, or any objects which cannot disappear into thin 
air) n(t, x). This quantity, the number of particles per unit volume, is a rotational scalar. 
Think of a bunch of particles sitting inside a box. Rotations change neither the volume of 
the box nor the number of particles inside. 

Now the question: Does n(t, x) relativistically complete into a Lorentz scalar or some- 
thing else? In general, there is no algorithm such that you can simply turn the crank and 
answer this question. To obtain the answer, you need to exercise some physical insight 
or mathematical savvy. For this simple example, I will give a physical argument, to be 
followed by a more formal mathematical analysis. 

To an observer moving by, the box is Lorentz contracted (figure 2) in the direction of
motion by $\sqrt{1 - v^2}$ and thus its volume is diminished by this factor. (Incidentally, as already
mentioned earlier, the $v^2$ that appears in this square root factor stands for $v_N^2$; we will
henceforth suppress the subscript $N$.) Since the number of particles inside is unchanged,
we conclude that the number density as seen by this observer is larger than the number
density seen by an observer at rest with the box by $1/\sqrt{1 - v^2}$. In other words, $n(t, \vec{x})$
transforms like the time component of a 4-vector.

Figure 2 (a) A box containing a certain number of
particles inside. (b) To an observer moving by, the
box is Lorentz contracted, and thus the number
density as seen by this observer is larger than the
number density seen by an observer at rest with
the box.

Physically, it is obvious what the other 3 components are. This observer sees the particles
moving and thus observes a current density. The most naive guess that the rotational
scalar $n(x)$ is promoted to a Lorentz scalar is wrong. Rather, it is promoted to be a
component of a Lorentz vector $n^\mu(x) = (n^0(x), n^i(x))$, a 4-current. Nothing strange or
unusual here: after all, energy, a rotational scalar, is promoted to be a component of the
momentum $p^\mu$. As a check, apply the transformation law for a Lorentz vector. To the
observer at rest with the box, $n^\mu(x) = (n, \vec{0})$. Thus, to the observer watching the box go by
(figure 2) along the 1-axis say, Lorentz transformation gives $n'^0 = \frac{n^0 + |\vec{v}|n^1}{\sqrt{1 - v^2}} = \frac{n}{\sqrt{1 - v^2}} > n$ and
$n'^1 = \frac{n^1 + |\vec{v}|n^0}{\sqrt{1 - v^2}} = \frac{n|\vec{v}|}{\sqrt{1 - v^2}} > n|\vec{v}|$, as we argued physically. The inequalities indicate Lorentz
contraction.
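The transformation law used in this check can be spelled out in a few lines of Python. This is only a sketch under stated assumptions: the function name `boost_along_1_axis` is invented here, the boost is along the 1-axis with $c = 1$, and the sign convention for $v$ is chosen so that a box at rest is seen moving in the positive 1-direction.

```python
import math

def boost_along_1_axis(a, v):
    """Boost a 4-vector a = (a^0, a^1, a^2, a^3) along the 1-axis with speed v."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    a0, a1, a2, a3 = a
    return (gamma * (a0 + v * a1), gamma * (a1 + v * a0), a2, a3)

n_rest = (2.0, 0.0, 0.0, 0.0)              # n^mu = (n, 0) for the observer at rest with the box
n_moving = boost_along_1_axis(n_rest, 0.6)
print(n_moving)                            # density n / sqrt(1 - v^2), current n v / sqrt(1 - v^2)

# A 4-vector that vanishes in one frame vanishes in every frame,
# which is the argument used earlier for K^mu = 0.
print(boost_along_1_axis((0.0, 0.0, 0.0, 0.0), 0.6))
```

The first output shows the density enhanced by $1/\sqrt{1 - v^2}$ together with a nonzero current, exactly the pair of inequalities in the text.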


We next give a more mathematical treatment, reaching the same conclusion in the end.
For simplicity, suppose we have only one particle sitting at the origin. Then $n(t, \vec{x}) = \delta^{(3)}(\vec{x})$,
with the 3-dimensional delta function introduced in chapter II.1. In other words,
$n(t, \vec{x})$ is concentrated at the origin, vanishing everywhere else and unchanging in
time. Check to make sure that we have one particle: $\int d^3x\, n(t, \vec{x}) = \int d^3x\, \delta^{(3)}(\vec{x}) = \int dx \int dy \int dz\, \delta(x)\delta(y)\delta(z) = 1^3 = 1$
as expected. More importantly, this shows that $\delta^{(3)}(\vec{x})$
is rotationally invariant,* since $d^3x$ is rotationally invariant. Our challenge is now to write
$n(t, \vec{x}) = \delta^{(3)}(\vec{x})$ in a relativistic form.

Introduce the worldline of the particle traced out by $q^\mu(\tau)$. For a particle just sitting
there, $q^0(\tau) = \tau$, $\vec{q}(\tau) = 0$, and $d\tau^2 = -\eta_{\mu\nu}dq^\mu dq^\nu$. (Now is a good time to recall the bad
notation alerts I sounded repeatedly in part I concerning the distinction between $x$ and $q$,
between spacetime and some particle’s location. In other words, $x$ is where you are and $q$
is where the particle is. Here the word “when” is subsumed into “where” (in spacetime) as
per Minkowski.) Write $\delta^{(3)}(\vec{x}) = \int d\tau\, \delta(x^0 - q^0(\tau))\,\delta^{(3)}(\vec{x}) = \int d\tau\, \frac{dq^0}{d\tau}\,\delta(x^0 - q^0(\tau))\,\delta^{(3)}(\vec{x} - \vec{q}(\tau)) = \int d\tau\, \frac{dq^0}{d\tau}\,\delta^{(4)}(x - q(\tau))$.
Here we have introduced the 4-dimensional delta function
$\delta^{(4)}(x) \equiv \delta(x^0)\delta^{(3)}(\vec{x})$, which (in analogy with the discussion for $\delta^{(3)}(\vec{x})$ being a rotational
scalar) we argue is a Lorentz scalar, since $\int d^4x\, \delta^{(4)}(x) = 1$ and $d^4x$ is Lorentz invariant
(recall chapter III.3).

For the abecedarian struggling to follow this, it may seem a totally pointless academic
exercise in which we make things progressively more complicated. For example, in the
second equality, we rewrote 1 as $\frac{dq^0}{d\tau}$ and introduced $0 = \vec{q}(\tau)$. But of course there is
a point; otherwise this wouldn't appear in a textbook. The point is that the expression
we ended up with is manifestly the time component of the Lorentz 4-vector $n^\mu(x) = \int d\tau\, \frac{dq^\mu}{d\tau}\,\delta^{(4)}(x - q(\tau))$.
Everything on the right hand side is a Lorentz scalar except for
the 4-vector $\frac{dq^\mu}{d\tau}$. Indeed, by Lorentz, this expression, now that we have it in this form,
holds for any worldline $q^\mu(\tau)$, not just the simple form we had above. Furthermore, we
can sum over an arbitrary number of particles. See figure 3.

To summarize, we define the number current (4-current, to be pedantic) as

$$n^\mu(x) = \sum_a \int d\tau_a\, \frac{dq_a^\mu}{d\tau_a}\,\delta^{(4)}(x - q_a(\tau_a)) \qquad (3)$$


* You have essentially shown this already in exercise I.3.2. 
Figure 3 Defining the number current as a 4-vector in spacetime;
the dashed line indicates a moment in time.


with $d\tau_a^2 = -\eta_{\mu\nu}dq_a^\mu dq_a^\nu$. It transforms correctly and reduces in the appropriate limit to
something with vanishing space components and a time component equal to what we
would call a number density. What more could you want?

In chapter I.4, I explained that an irreducible representation of a group, upon restriction
to a subgroup, will become reducible and break up in general into a direct sum of a number
of smaller representations of the subgroup. Thus, the 4-dimensional representation of
the Lorentz group $SO(3, 1)$, upon restriction to the rotation group $SO(3)$, breaks up as
$4 \to 3 + 1$, a 3-vector and a 3-scalar. Here we are asking the reverse: given a rotational
scalar (what we also call a 3-scalar), which representation of $SO(3, 1)$ does it come from?
In general, there is no unique answer. We have to appeal to physical considerations as is
done here.


Current conservation 


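Another nice feature is that the conservation of the number of particles we started out
with, $\frac{\partial}{\partial t}n(t, \vec{x}) = \frac{\partial}{\partial t}\delta^{(3)}(\vec{x}) = 0$, is instantly generalized to

$$\frac{\partial n^\mu}{\partial x^\mu} = \frac{\partial n^0}{\partial t} + \frac{\partial n^i}{\partial x^i} = \frac{\partial n^0}{\partial t} + \vec{\nabla} \cdot \vec{n} = 0 \qquad (4)$$

known as the continuity equation. You no doubt encountered this when you first learned
what a divergence $\vec{\nabla} \cdot \vec{n}$ was. You were probably taught to draw a little cube and to add up
the number current flowing in and out of the six faces of the cube. Under rotations, $\partial_0 n^0$
and $\partial_i n^i$ both transform as 3-scalars, but under Lorentz transformations, they are revealed
as parts of the same package $\partial_\mu n^\mu$.

Admire the power of Lorentz invariance: it mandates that $\partial n/\partial t$ must be promoted to $\partial_\mu n^\mu$.
Nevertheless, it may also be instructive to verify (4) laboriously. Acting with $\partial_\mu$ on (3),
we encounter $\partial_\mu \delta^{(4)}(x - q_a(\tau_a)) = -\frac{\partial}{\partial q_a^\mu}\delta^{(4)}(x - q_a(\tau_a))$ and hence the integral

$$\int d\tau_a\, \frac{dq_a^\mu}{d\tau_a}\frac{\partial}{\partial q_a^\mu}\delta^{(4)}(x - q_a(\tau_a)) = \int d\tau_a\, \frac{d}{d\tau_a}\delta^{(4)}(x - q_a(\tau_a)) = \delta^{(4)}(x - q_a(\tau_a))\Big|_{\tau_a = -\infty}^{\tau_a = \infty} = 0$$

since* $q_a^0(\pm\infty) = \pm\infty$, while $t = x^0$ is some fixed instant in time.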
Figure 4 Counting the number of particles inside some 3-volume
$V$ at some time $t$. Here $\partial V$ indicates the boundary of $V$.


Another instructive exercise is to integrate $n^0(t, \vec{x})$ over some 3-volume $V$ at some time
$t$ (figure 4). We encounter $\delta(t - q_a^0(\tau_a)) \int_V d^3x\, \delta^{(3)}(\vec{x} - \vec{q}_a(\tau_a))$, with the integral giving 1
if the particle is inside $V$ at that time and 0 if not. Note the time delta function “slices” the
worldline and fixes $\tau_a$ to be that value at which $q_a^0(\tau_a) = t$. Thus,

$$\int_V d^3x\, n^0(t, \vec{x}) = \sum_{(a \in V)} \int d\tau_a\, \frac{dq_a^0}{d\tau_a}\,\delta(t - q_a^0(\tau_a)) = \sum_{(a \in V)} \int dq_a^0\, \delta(t - q_a^0) = \sum_{(a \in V)} 1 = N_V(t)$$

giving precisely the number of particles inside $V$. You may find this all quite involved, but
it is actually just telling you the obvious. Furthermore, applying (4), we have $\frac{d}{dt}N_V(t) = \int_V d^3x\, \frac{\partial n^0}{\partial t} = -\int_V d^3x\, \frac{\partial n^i}{\partial x^i} = -\int_{\partial V} dS_i\, n^i$
(using the divergence theorem): the change
in $N_V(t)$ with time is of course given by integrating the number current $n^i$ flowing through
the surface element $dS_i$ forming the surface $\partial V$ enclosing $V$.

We went through the number current in detail to save work later. Indeed, we can now
write down the relativistic form of the electromagnetic current without further ado by
simply including the charge $e_a$ carried by the $a$th particle:

$$J^\mu(x) = \sum_a e_a \int d\tau_a\, \frac{dq_a^\mu}{d\tau_a}\,\delta^{(4)}(x - q_a(\tau_a)) \qquad (5)$$

By the same considerations as above, we obtain the conservation of the electromagnetic
current

$$\partial_\mu J^\mu = 0 \qquad (6)$$

Energy momentum tensor 


Moving on from the number density, we can now easily study energy density. As before,
consider a bunch of particles sitting in a box. As is obvious by now, in nonrelativistic
physics, energy density, which we denote by $\rho(t, \vec{x})$, is a rotational scalar, since energy
and the volume of the box are both rotationally invariant. How does it transform under a
Lorentz transformation?

We simply invoke our physical argument again. Our moving observer not only sees the
box contracted, but he also sees the particles moving; thus $\rho(t, \vec{x})$ is enhanced by not one,
but two factors of $1/\sqrt{1 - v^2}$. From what we learned in chapter III.3, it transforms like
the time-time component $T^{00}(x)$ of a 4-tensor $T^{\mu\nu}(x)$, known as the energy momentum
tensor (also called the stress energy tensor by some). In other words, this observer sees
$T'^{00} = \Lambda^0{}_\mu \Lambda^0{}_\nu T^{\mu\nu} = (\Lambda^0{}_0)^2\, T^{00} = \frac{T^{00}}{1 - v^2}$ (since for the unprimed observer the only
nonvanishing component of $T^{\mu\nu}$ is $T^{00}$).

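Just as for the number current (3), for which we count the particles, and for the electromagnetic
current (5), for which we add up the charges, here we tally the 4-momentum $p_a^\mu$
carried by each particle. We write down instantly the energy momentum tensor

$$T^{\mu\nu}(x) = \sum_a \int d\tau_a\, p_a^\mu \frac{dq_a^\nu}{d\tau_a}\,\delta^{(4)}(x - q_a(\tau_a)) = \sum_a m_a \int d\tau_a\, \frac{dq_a^\mu}{d\tau_a}\frac{dq_a^\nu}{d\tau_a}\,\delta^{(4)}(x - q_a(\tau_a)) \qquad (7)$$

We already know that we need a tensor. Gratifyingly, $p_a^\mu \frac{dq_a^\nu}{d\tau_a}$ is
manifestly a tensor. The second form in (7) emphasizes that $T^{\mu\nu}(x)$ is a symmetric tensor
and thus possesses $\frac{1}{2} \cdot 4 \cdot 5 = 10$ independent components.

We now have the energy momentum conservation law (exercise 3)

$$\partial_\mu T^{\mu\nu}(x) = 0 \qquad (8)$$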


in parallel with the current conservation law (6). Just as before when we counted the
number of particles in a volume $V$, the amount of energy momentum contained in the
volume $V$ is given by

$$\int_V d^3x\, T^{0\nu}(t, \vec{x}) = \sum_{(a \in V)} \int d\tau_a\, \frac{dq_a^0}{d\tau_a}\, p_a^\nu\, \delta(x^0 - q_a^0(\tau_a)) = \sum_{(a \in V)} p_a^\nu = P_V^\nu(t) \qquad (9)$$

In the second equality, we again use $d\tau_a \frac{dq_a^0}{d\tau_a} = dq_a^0$ to convert the integral over $\tau_a$ into
an integral over $q_a^0$ to knock off the remaining $\delta$ function. (As they say, the first time a
philosopher, the second time a connoisseur. In physics, you are a world expert the second
time you use a trick.) We obtain the total 4-momentum $P_V^\nu(t)$ contained inside the 3-volume
$V$ at time $t$.


Stress 


Again, to the beginner, it may seem at first sight a bit odd that we would take something like
$T^{0\nu}(x)$, which carries an explicit time index, and integrate it over the Lorentz noninvariant
3-volume. But $d^3x$ is precisely looking for something that transforms like $dt$ to “complete
itself.”

Confusio nods: “Yes, I can see from (9) that $T^{0\nu}(x) = (T^{00}(x), T^{0i}(x))$ describes the
spatial densities of energy and momentum, but I have a harder time picturing $T^{ij}(x)$.”
Well, Confusio, you are not alone. Most people have trouble. It might help to note that
(as before for the number current) we have $\frac{d}{dt}P_V^\nu(t) = \int_V d^3x\, \partial_0 T^{0\nu} = -\int_V d^3x\, \partial_i T^{i\nu} = -\int_{\partial V} dS_i\, T^{i\nu}$.
Set $\nu = j$. Then

$$\frac{d}{dt}P_V^j = -\int_{\partial V} dS_i\, T^{ij} \qquad (10)$$

tells us that the time rate of change of the 3-momentum $P_V^j(t)$ (hence a force) has to
do with $T^{ij}$ acting on the surface element $dS_i$. In other words, $T^{ij}$ is a force per unit
area, and hence must be a pressure pushing in the $j$th direction, exerted on an area
element pointing in the $i$th direction. Note that since $T^{ij} = T^{ji}$, this pressure is the same
as the pressure pushing in the $i$th direction, exerted on an area element pointing in the
$j$th direction. This physical picture also underlines the tensorial character of $T^{ij}$: there
are two directions involved when you press against a surface. Does that make sense,
Confusio?

“Yes indeed, one for the direction of the force, the other for the orientation of the 
surface.” 

Thus, some authors call the energy momentum tensor the stress energy tensor. 

Incidentally, this discussion shows explicitly that upon restriction (recall chapter I.4)
to the rotation subgroup, the 10-dimensional representation of the Lorentz group decomposes
as $10 \to 1 + 3 + 6$, namely $T^{\mu\nu} = \{T^{00}, T^{0i}, T^{ij}\}$.


At a specific time 


We could do the integrals in (3), (5), and (7), if our little hearts desire. To do this, I need to
teach you an identity. Since the delta function is a big spike with total area under the peak
$\int_{-\infty}^{\infty} dx\, \delta(x) = 1$, we have $\int_{-\infty}^{\infty} dx\, \delta(x)s(x) = s(0)$ for a sufficiently smooth function $s(x)$.
Also,

$$\int_{-\infty}^{\infty} dx\, \delta(bx)s(x) = \int_{-\infty}^{\infty} dx\, \frac{\delta(x)}{|b|}s(x) = \frac{s(0)}{|b|}$$

where the factor of $1/b$ follows from dimensional analysis. (To see the need for the
absolute value, simply note that $\delta(bx)$ is a positive function. Alternatively, change the
integration variable from $x$ to $y = bx$: for $b$ negative, we have to flip the integration limits.)
A trivial generalization is $\int_{-\infty}^{\infty} dx\, \delta(b(x - a))s(x) = \frac{s(a)}{|b|}$. Once we know how to deal with
a linear function inside the delta function, we can handle any smooth function $f(x)$ inside
the delta function. Denote the zeroes of $f(x)$ by $x_A$ (in other words, $f(x_A) = 0$) and
write $f'(x_A) \equiv \frac{df}{dx}(x_A)$. Expand the function $f(x)$ around its zeroes. Break the integral
$\int_{-\infty}^{\infty} dx\, \delta(f(x))s(x)$ into pieces, each containing a zero $x_A$. Hence the identity

$$\int_{-\infty}^{\infty} dx\, \delta(f(x))s(x) = \sum_A \frac{s(x_A)}{|f'(x_A)|} \qquad (11)$$

(I trust you to deal with nongeneric cases, such as what happens if $f'(x_A)$ vanishes for
some $A$.)
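If you distrust delta function identities, you can test (11) on a computer by replacing the delta function with a narrow Gaussian. The Python sketch below is only a rough numerical illustration, not a proof: the Gaussian width `eps`, the midpoint rule, the number of steps, and the sample functions $f(x) = x^2 - 1$ and $s(x) = x + 2$ are all my own arbitrary choices.

```python
import math

def delta_approx(y, eps):
    """A narrow normalized Gaussian standing in for the delta function."""
    return math.exp(-0.5 * (y / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def integrate_delta_of_f(f, s, lo, hi, eps=1e-2, steps=200_000):
    """Midpoint-rule estimate of the integral of delta(f(x)) s(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += delta_approx(f(x), eps) * s(x)
    return total * h

def f(x):
    return x * x - 1.0   # zeroes at x = +1 and x = -1, with |f'| = 2 at both

def s(x):
    return x + 2.0

numeric = integrate_delta_of_f(f, s, -3.0, 3.0)
# Right hand side of (11): sum over zeroes of s(x_A) / |f'(x_A)|.
identity = s(1.0) / abs(2.0 * 1.0) + s(-1.0) / abs(2.0 * -1.0)
print(numeric, identity)  # the two agree up to corrections of order eps^2
```

Both zeroes contribute, each weighted by $1/|f'(x_A)|$; dropping the absolute value would give the wrong sign for the zero at $x = -1$.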

We now focus on an integral $\int d\tau_a\, \delta(t - q_a^0(\tau_a))\,s(\tau_a)$ in (3), singling out the time delta
function as the one we are knocking off and calling the rest of the integrand $s(\tau_a)$. Applying
the identity (11), we find that the integral equals $s(\tau_a)/|(dq_a^0/d\tau_a)|$ evaluated at the value
of $\tau_a$ that solves the equation $q_a^0(\tau_a) = t$. Once again, this may seem rather sophisticated
to the reader seeing it for the first time, but it is simply determining the proper time
$\tau_a$ of particle $a$ as it crosses the specified time slice $t$. (See figure 3.) Since worldlines
cannot go backward, the sum in (11) reduces to just one term. Thus, we encounter
$(dq_a^\mu/d\tau_a)/(dq_a^0/d\tau_a) = (1, \vec{v}_{Na})$ as defined earlier and (3) breaks up into

$$n^0(x) = \sum_a \delta^{(3)}(\vec{x} - \vec{q}_a(\tau_a))$$
$$\vec{n}(x) = \sum_a \vec{v}_{Na}\, \delta^{(3)}(\vec{x} - \vec{q}_a(\tau_a)) \qquad (12)$$

(with $\tau_a$ defined by $q_a^0(\tau_a) = t$) precisely as we would expect. (Reneging on our earlier
promise, we put back the subscript $N$ here for emphasis.)
Similarly, when we do the integral in (7), we encounter $p_a^\mu(\tau_a)(dq_a^\nu/d\tau_a)/(dq_a^0/d\tau_a)$,
which we write as $p_a^\mu(\tau_a)p_a^\nu(\tau_a)/E_a(\tau_a)$ (since $E_a = m_a\frac{dq_a^0}{d\tau_a} = p_a^0$ is just another name for
$p_a^0$). Thus,

$$T^{\mu\nu}(x) = \sum_a \frac{p_a^\mu(\tau_a)\,p_a^\nu(\tau_a)}{E_a(\tau_a)}\, \delta^{(3)}(\vec{x} - \vec{q}_a(\tau_a)) \qquad (13)$$

The denominator $E_a$ may seem a bit off to you, but you will soon see that it plays a
necessary role.


Perfect fluids and the comoving observer 


Consider a system of many particles. If the spatial separation between particles and the
mean time between collisions are much less than the length and time scales we are
interested in, we have a fluid (a term used loosely here to include gases). The various
currents we have been discussing all become smooth functions of $x$. At a given point in
spacetime, the fluid moves with a 4-velocity $U^\mu(x)$ normalized to

$$\eta_{\mu\nu}U^\mu U^\nu = U^\mu U_\mu = -1 \qquad (14)$$

For instance, the number current would then be given by $n^\mu(x) = n(x)U^\mu(x)$.

For an observer going with the flow so to speak, known as a comoving observer,
$U^0 = 1$, $U^i = 0$, and thus $n(x)$ is just the number density seen by the comoving
observer.

If the fluid is isotropic as seen by the comoving observer (that is, the fluid does not have 
a special direction in its local rest frame), it is said to be perfect. 

Since $U^i = 0$ and there is no other 3-vector available to construct $T^{0i}$ out of, $T^{0i}$ vanishes,
by rotational invariance. Furthermore, the only thing we have available to construct the
symmetric rotational tensor $T^{ij}$ is the Kronecker delta $\delta^{ij}$. Hence the stress energy tensor of
a perfect fluid at that point has the form $T^{00}(x) = \rho(x)$, $T^{0i}(x) = 0$, and $T^{ij}(x) = P(x)\delta^{ij}$,
with $\rho$ and $P$ some function of $x$. Written out as a matrix, we have

$$T^{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{pmatrix} \qquad (15)$$

Let us invite ourselves to express this in terms of $U^\mu$ and $\eta^{\mu\nu}$. As you can quickly verify
component by component,

$$T^{\mu\nu}(x) = (\rho(x) + P(x))\,U^\mu(x)U^\nu(x) + P(x)\,\eta^{\mu\nu} \qquad (16)$$

Note that $U^\mu$ may vary from point to point, of course. As another easy check, note that in
the comoving frame, with the form (16), the conservation law $\partial_\mu T^{\mu\nu} = 0$ reduces to $\frac{\partial\rho}{\partial t} = 0$
for $\nu = 0$ and $\vec{\nabla}P = 0$ for $\nu = i$.


Moving fluids 


What if the fluid is moving, as fluids are wont to do? 

Behold the power of Lorentz invariance. Simply go to the frame of an observer moving 
relative to our comoving observer. To this observer, the perfect fluid is moving. All we have 
to do is Lorentz boost the 4-vector U“ = (1, 0) toU#% = ( u vt ): As we have been 


S102” /1-2 


saying until we are practically hoarse, if a tensor equation like (16) holds in one frame, it 


holds in all frames. Thus, for example, the energy density of a moving fluid is 


P v2P 
T(x) = (p + P)(U®)? P= ers P= ae (17) 
1— v2 1— v2 


Without the power of a symmetry argument, it would be somewhat challenging to work
out the relativistic corrections embodied in (17). Note that relativistically, the pressure $P$
contributes to the energy density $T^{00}$. I won’t deprive you of the fun of working out the
other components. (See exercise 4, from the result of which you can also see when $T^{ij}$ for
$i \neq j$ might be nonzero.) As should be clear from our discussion, here $\rho$, $P$, and $\vec v$ can all
depend on $x$.
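
As a numerical sanity check of the boost argument, here is a minimal sketch in Python. It assumes units with $c = 1$, a boost along the x direction only, and the mostly-plus metric used in (16); the function names are illustrative and not from the text. It boosts the rest-frame tensor (15) directly and compares the result with the covariant form (16).

```python
import math

# Minkowski metric eta^{mu nu} with signature (-, +, +, +), as in (16).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost_x(v):
    """Lorentz boost matrix taking U = (1, 0, 0, 0) to gamma * (1, v, 0, 0)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, gamma * v, 0.0, 0.0],
            [gamma * v, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def boosted_tensor(rho, P, v):
    """Transform diag(rho, P, P, P) of (15) with two boost matrices."""
    L = boost_x(v)
    rest = [rho, P, P, P]  # diagonal entries in the comoving frame
    return [[sum(L[m][a] * L[n][a] * rest[a] for a in range(4))
             for n in range(4)] for m in range(4)]


def covariant_form(rho, P, v):
    """Evaluate (rho + P) U^mu U^nu + P eta^{mu nu}, that is, (16)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    U = [gamma, gamma * v, 0.0, 0.0]
    return [[(rho + P) * U[m] * U[n] + P * ETA[m][n]
             for n in range(4)] for m in range(4)]


rho, P, v = 2.0, 0.5, 0.6
T = boosted_tensor(rho, P, v)
print("T^00 from the boost:", T[0][0])
print("T^00 from (17):     ", (rho + v * v * P) / (1.0 - v * v))
```

The two printed numbers should agree to rounding error, and the same holds component by component between `boosted_tensor` and `covariant_form`.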


Two gases for cosmology 


We now go back to (13) and calculate $\rho$ and $P$ for two important cases. To say that we have
a fluid means that we are to average the factor $p_a^\mu(\tau_a)\,p_a^\nu(\tau_a)$ in (13) over the multitude of
particles. By definition, a perfect fluid means that there is no preferred direction in the
comoving frame: the 3-vector $\vec p_a(\tau_a)$ points in all possible directions, so that $p_a^0(\tau_a)\,p_a^i(\tau_a)$
averages to 0, giving $T^{0i} = 0$, thus confirming our earlier conclusion using rotational
invariance. Next, recall exercise I.3.5, which showed that $p_a^i(\tau_a)\,p_a^j(\tau_a)$ becomes $\frac{1}{3}\vec p_a^{\,2}(\tau_a)\,\delta^{ij}$
when averaged over all directions. Thus, for a perfect fluid,

$$\rho = \left\langle \sum_a E_a(\tau_a)\,\delta^{(3)}(\vec x - \vec q_a(\tau_a)) \right\rangle, \qquad P = \left\langle \sum_a \frac{\vec p_a^{\,2}(\tau_a)}{3E_a(\tau_a)}\,\delta^{(3)}(\vec x - \vec q_a(\tau_a)) \right\rangle \tag{18}$$

where the angled brackets indicate averaging over particles (an averaging implied by the
sum already for a macroscopically large number of particles). Since for each particle,
$E^2 = \vec p^{\,2} + m^2 \geq \vec p^{\,2}$, we have $\rho \geq 3P \geq 0$. We can evaluate this result in two extreme cases.

For a nonrelativistic gas, $E = m + \frac{\vec p^{\,2}}{2m} + \cdots$. Statistical mechanics teaches us that the
quantity called temperature $T$ is twice the energy* possessed by each degree of freedom.
Each (monoatomic) gas particle has three kinetic degrees of freedom, so that its average
kinetic energy is just $\frac{3}{2}T$. Thus, using (18), we obtain for the ideal gas $\rho = n(m + \frac{3}{2}T)$
and, since $\left\langle \frac{\vec p^{\,2}}{3E} \right\rangle \simeq \frac{1}{3m}\langle \vec p^{\,2} \rangle = \frac{2}{3}\left\langle \frac{\vec p^{\,2}}{2m} \right\rangle = \frac{2}{3}\cdot\frac{3}{2}T = T$, we have $P = nT$, or $PV = NT$ in more
elementary notation. A nice derivation of the equation of state of an ideal gas, no? As we
have suspected all along, $P$ is in fact the pressure.

For a highly relativistic gas, $E = |\vec p| + \cdots$ and so $\rho \simeq 3P$. In particular, the pressure of
a photon gas is given by

$$P = \frac{1}{3}\rho \tag{19}$$

Note from (15) that in this case, the energy momentum tensor is traceless: $\eta_{\mu\nu}T^{\mu\nu} = -\rho + 3P = 0$.
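
The two limits can be illustrated with a toy calculation. The sketch below is a simplification and not the averaging in (18) itself: it assumes every particle carries the same momentum magnitude $p$ (a monoenergetic gas), and it replaces the average over all directions by an average over the six coordinate-axis directions, which happens to reproduce the isotropic results exactly for these quadratic quantities. All names are hypothetical.

```python
import math

# Six unit vectors along +-x, +-y, +-z: a crude stand-in for "all directions".
AXIS_DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                   (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def direction_average(f):
    """Average a function of the unit direction over the six axis directions."""
    return sum(f(d) for d in AXIS_DIRECTIONS) / len(AXIS_DIRECTIONS)


def gas_rho_and_P(n, m, p):
    """Energy density and pressure for number density n, mass m, and a
    common momentum magnitude p (assumption: monoenergetic, isotropic)."""
    E = math.sqrt(p * p + m * m)
    return n * E, n * p * p / (3.0 * E)


# Direction averages: <n^i> = 0, <n^x n^y> = 0, <n^x n^x> = 1/3.
print(direction_average(lambda d: d[0]),
      direction_average(lambda d: d[0] * d[1]),
      direction_average(lambda d: d[0] * d[0]))

# Massless particles: rho = 3P, as in (19).
print(gas_rho_and_P(1.0, 0.0, 2.0))

# Slow massive particles: P is tiny compared to rho, and P ~ n p^2 / (3m).
print(gas_rho_and_P(1.0, 1.0, 1e-3))
```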

In modern cosmology, the content of the universe is typically treated as a perfect fluid, 
as we will see in chapter VIII.1. 


Nature of the gravitational field 


Our simple intuitive argument involving moving particles in a contracting box (which
shows that the energy density $T^{00}$, a rotational scalar, is promoted to a component of a 2-indexed
tensor) gives us a tantalizing hint of the nature of the gravitational field. Already in
chapters II.1 and II.3, I reminded you that in Newtonian gravity, the gravitational potential
$\Phi$ satisfies Poisson’s equation $\nabla^2\Phi(x) = 4\pi G\rho(x)$.

But if $\rho$ is promoted, then it appears that $\Phi$ should also be promoted to the time-time
component of a 2-indexed tensor, so that the two sides of whatever equation we have in
Einstein gravity to determine $\Phi$ in terms of $\rho$ would transform in the same way. This
strongly suggests that the gravitational field is a tensor field. We will see in chapter VI.1
that this guess is correct. Furthermore, $\nabla^2$ is clearly not Lorentz invariant and should be
promoted to $\partial^2 = \eta^{\mu\nu}\partial_\mu\partial_\nu = -\partial_t^2 + \nabla^2$. We will develop this line of thought further in
chapter IX.5.


* We omit the historical conversion factor k between degree and the unit of energy introduced by Boltzmann.


Appendix 1: Einstein’s derivation of the amusing and seductive E = mc²


Did you know that Einstein didn’t have $E = mc^2$ in his first paper on special relativity? This famous relation⁵
appeared a few months later in a very brief note. Einstein wrote to a friend excitedly: “One more consequence of
the paper on electrodynamics has also occurred to me. . . . The argument is amusing and seductive; but for all 
I know the Lord might be laughing over it and leading me around by the nose.” 

As we all know, the Lord did not lead Einstein around by the nose. 

The derivation I gave in this chapter of this relation is “modern” and more or less standard. Here I will describe
an elegant derivation* given by Einstein in 1946, which surprisingly, is omitted from most textbooks⁶ and so is
in danger of being forgotten.

Suppose Ms. Unprime observes an atom of mass $M$ at rest emitting two photons with equal and opposite
momenta, thus leaving the “daughter atom” at rest. See figure 5a,b. Let Mr. Prime move with velocity $-v$ in a
direction (call it the x-axis) perpendicular to the direction defined by the motion of the photons. We will take $v \ll c$
so that we can use Newtonian mechanics to describe the motion of the atom before and after the emission. Thus,
Mr. Prime sees, before the emission, an atom moving along the x-axis with velocity $v$, and, after the emission,
the daughter atom moving along the x-axis with velocity $v$, together with two photons with the x components of
their velocities equal to $v$. See figure 5c,d. Given that the speed of light is $c$, it follows that the two photons move
away from the x-axis at an angle $\theta$ given by $\cos\theta = v/c$, as indicated in the figure. (We are taking $v \ll c$ so the
figure is not to scale.)

The key ingredient in the argument is that a photon carrying energy $E_\gamma$ has momentum $p_\gamma = E_\gamma/c$, a result
that goes back in some sense to Einstein’s own Nobel Prize-winning work on the photoelectric effect. (Here is a
nifty derivation. By dimensional analysis, $p_\gamma$ must be a constant times $E_\gamma/c$. But Mr. Maxwell already calculated
the momentum carried by an electromagnetic wave (using the Poynting vector, remember?) to be equal to the
energy of the wave divided by $c$. Thus, if we think of the wave as a stampede of photons, we could argue that the
constant must be 1.)

Momentum conservation holds trivially for Ms. Unprime. Now watch Mr. Prime impose momentum conservation
in the x direction. The x component of the photon momentum is $p_x = \cos\theta\,E_\gamma/c = vE_\gamma/c^2$. So momentum
conservation requires

$$Mv = mv + 2\frac{vE_\gamma}{c^2} \tag{20}$$


Here, just to keep an open mind, we write the mass of the daughter atom as m, which may well be equal to M, 
the mass of the atom before emission. As you know, before Einstein, most physicists would have thought that 
m = M. But now we see immediately that momentum conservation cannot be satisfied unless m < M. Cool, eh? 
From this argument, it already follows that during emission the atom must lose mass. 

Next, Mr. Prime applies energy conservation, as any decent physics student would. Before emission, the atom
has energy $A + \frac{1}{2}Mv^2$. Again, to be open minded, we add a constant $A$, which Mr. Newton did not know about,
to his kinetic energy. The atom loses mass, and so might end up possessing less of everything. Similarly, we
suppose that after emission, the atom has energy $a + \frac{1}{2}mv^2$. Energy conservation now requires

$$A + \frac{1}{2}Mv^2 = a + \frac{1}{2}mv^2 + 2E_\gamma = a + \frac{1}{2}mv^2 + (M - m)c^2 \tag{21}$$

where in the last step we used (20). By assumption $v \ll c$, so we can drop the $v^2$ terms (the energy conservation
equation is consistent as written, since in the derivation we already have dropped terms suppressed by $v/c$). We
thus obtain $A - a = (M - m)c^2$, or in other words, $\delta E = (\delta m)c^2$. Following Einstein, we integrate this to obtain
$E = mc^2$.
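
The bookkeeping in (20) and (21) is easy to check numerically. The following sketch simply solves (20) for the daughter mass and (21) for the drop $A - a$, keeping the $v^2$ terms, so that one can watch $(A - a)/((M - m)c^2)$ approach 1 as $v/c \to 0$. The function names and the sample numbers are illustrative assumptions.

```python
def daughter_mass(M, E_gamma, c=1.0):
    """Solve the momentum balance (20), M v = m v + 2 v E_gamma / c^2, for m."""
    return M - 2.0 * E_gamma / c ** 2


def rest_energy_drop(M, E_gamma, v, c=1.0):
    """A - a from the energy balance (21), with the v^2 terms kept:
    A + M v^2 / 2 = a + m v^2 / 2 + 2 E_gamma."""
    m = daughter_mass(M, E_gamma, c)
    return 2.0 * E_gamma - 0.5 * (M - m) * v * v


M, E_gamma = 10.0, 0.25
m = daughter_mass(M, E_gamma)
for v in (1e-1, 1e-2, 1e-3):
    ratio = rest_energy_drop(M, E_gamma, v) / ((M - m) * 1.0 ** 2)
    print(v, ratio)  # the ratio tends to 1 as v becomes small
```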

Nowadays, I could have used the decay of the neutral pion into two photons $\pi^0 \to \gamma + \gamma$ instead of the radiative
emission of an atom and simplified the derivation slightly (since the $\pi^0$ meson has no daughter, so to speak).
But it would be somewhat unfair, since the possibility of a massive particle disappearing into two poufs of energy
was hardly conceivable back then. Of course, if we are allowed to use the entire formalism of Lorentz vectors,
we could simply write down the conservation of 4-momentum $p_\mu = k_\mu + q_\mu$ and evaluate it in the rest frame of
the pion.


* I like Einstein’s 1946 derivation much better than his original 1905 derivation, which invoked the Lorentz
transformation and was unnecessarily complicated.


Figure 5 Einstein’s nearly forgotten 1946 derivation of $E = mc^2$.
(a) Ms. Unprime observes an atom of mass $M$ at rest. (b) The atom
subsequently emits two photons with equal and opposite momenta,
thus leaving the daughter atom of mass $m$ at rest. (c) Mr. Prime
observes an atom moving along the x-axis with velocity $v$. (d) The
atom subsequently emits two photons. To Mr. Prime, the daughter
atom continues to move along the x axis with velocity $v$, and the two
photons have velocities with x-components equal to $v$.


Appendix 2: Conservation and relativistic fluid dynamics 


This appendix may be omitted upon first reading. Fluid dynamics describes how stuff, energy, and momentum
flow from place to place as a function of time and is thus governed by the two conservation laws (4) and (8).
Written out more explicitly, $\partial_\mu n^\mu = 0$ becomes

$$\frac{\partial}{\partial t}\left(\frac{n}{\sqrt{1 - v^2}}\right) + \vec\nabla\cdot\left(\frac{n\vec v}{\sqrt{1 - v^2}}\right) = 0 \tag{22}$$

Plugging the perfect fluid form (16) into $\partial_\mu T^{\mu\nu} = 0$, we have

$$\{\partial_\mu[(\rho + P)U^\mu]\}U^\nu + (\rho + P)U^\mu\partial_\mu U^\nu = -\eta^{\mu\nu}\partial_\mu P \tag{23}$$

It takes some work to massage this into shape. Divide the four equations in (23) into one “time equation,” obtained
by setting $\nu = 0$, and three “space equations,” obtained by setting $\nu = i$. Solve the time equation for the quantity in
the curly brackets, $\partial_\mu[(\rho + P)U^\mu] = \frac{1}{U^0}[\partial_0 P - (\rho + P)U^\mu\partial_\mu U^0]$, and plug this back into the space equations. We
encounter* $v^i = \frac{U^i}{U^0}$ and the differential operator $\frac{1}{U^0}U^\mu\partial_\mu = \frac{\partial}{\partial t} + \vec v\cdot\vec\nabla$, which the reader familiar with elementary
fluid dynamics will recognize as the convective derivative. (Indeed, we derived it in the appendix to chapter III.1
using Galilean invariance.)

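After the dust settles, we obtain the relativistic Euler equation

$$\left(\frac{\partial}{\partial t} + \vec v\cdot\vec\nabla\right)\vec v = -\frac{1 - v^2}{\rho + P}\left(\vec\nabla P + \vec v\,\frac{\partial P}{\partial t}\right) \tag{24}$$

The reader just alluded to would notice that the usual nonrelativistic Euler equation (also known as the Navier-Stokes
equation and referred to as such in chapter III.1) $\left(\frac{\partial}{\partial t} + \vec v\cdot\vec\nabla\right)\vec v = -\frac{1}{\rho}\vec\nabla P$ emerges in the limit $v \ll 1$ and
$P \ll \rho$.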
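
For readers who want to see the dust settle, here is one way the intermediate algebra can go, written out as a sketch. It uses only (23), the solution of the time equation quoted above, $U^i = v^i U^0$, and $(U^0)^2 = 1/(1 - v^2)$; the grouping of terms is a choice made here for clarity.

```latex
\begin{align*}
% Space components (nu = i) of (23), with eta^{i mu} d_mu P = d_i P:
\{\partial_\mu[(\rho+P)U^\mu]\}\,U^i + (\rho+P)\,U^\mu\partial_\mu U^i &= -\partial_i P \\
% Insert the time-equation solution and U^i = v^i U^0:
v^i\big[\partial_0 P - (\rho+P)\,U^\mu\partial_\mu U^0\big]
  + (\rho+P)\big[U^0\,U^\mu\partial_\mu v^i + v^i\,U^\mu\partial_\mu U^0\big] &= -\partial_i P \\
% The two U^mu d_mu U^0 terms cancel:
(\rho+P)\,(U^0)^2\Big(\frac{\partial}{\partial t} + \vec v\cdot\vec\nabla\Big)v^i
  &= -\Big(\partial_i P + v^i\,\frac{\partial P}{\partial t}\Big)
\end{align*}
```

Dividing by $(\rho + P)(U^0)^2 = (\rho + P)/(1 - v^2)$ then gives (24).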

We have thus far extracted 3 equations, namely (24), out of the 4 equations contained in (23). To extract the
remaining equation, contract (23) with $U_\nu$ and use $0 = \partial_\mu(U^\nu U_\nu) = 2(\partial_\mu U^\nu)U_\nu$ obtained by differentiating (14).
We find

$$\partial_\mu[(\rho + P)U^\mu] = U^\mu\partial_\mu P \qquad \text{“mystery equation”} \tag{25}$$

which you may or may not recognize.

What do we do with this? The clever trick is to go back to (4) $\partial_\mu n^\mu = \partial_\mu(nU^\mu) = 0$ and write the left side of this
“mystery equation” as $\partial_\mu[(\rho + P)U^\mu] = \partial_\mu\left[\frac{\rho + P}{n}\,nU^\mu\right] = nU^\mu\partial_\mu\left(\frac{\rho + P}{n}\right) = nU^\mu\partial_\mu\left(\frac{\rho}{n}\right) + nPU^\mu\partial_\mu\left(\frac{1}{n}\right) + U^\mu\partial_\mu P$.
Thus, the mystery equation becomes $U^\mu\partial_\mu\left(\frac{\rho}{n}\right) + PU^\mu\partial_\mu\left(\frac{1}{n}\right) = 0$.

Still don’t recognize this? We have already exploited our knowledge of statistical mechanics; now we invoke
thermodynamics. First, notice that $\frac{\rho}{n}$ and $\frac{1}{n}$ are energy and volume per particle, respectively. Second, recall the
first law of thermodynamics $dE + P\,dV = T\,dS$, with $S$ the entropy. Define $s$ as the entropy per particle. Since
$U^\mu\partial_\mu$ is proportional to the convective derivative, we see that the mystery equation tells us that, as the fluid flows
along, the changes in energy and in volume per particle are related by⁷

$$d\left(\frac{\rho}{n}\right) + P\,d\left(\frac{1}{n}\right) = T\,ds \tag{26}$$

In other words, the mystery equation says $TU^\mu\partial_\mu s = 0$, that is,

$$\left(\frac{\partial}{\partial t} + \vec v\cdot\vec\nabla\right)s = 0 \tag{27}$$


We obtain the convective conservation of specific entropy and have shown that the flow is adiabatic. Very 
satisfying! No dissipation in a perfect fluid. 

The set of equations, continuity (22), Euler (24), entropy conservation (27), together with an equation of state 
relating P and p and thus specifying the fluid, allows us to solve for the motion of the fluid. 

When I see elegant relativistic equations, (4) $\partial_\mu n^\mu = 0$ and (8) $\partial_\mu T^{\mu\nu} = 0$ in this case, split up brutally† into
their space and time components (22), (24), and (27), I must say that I am reminded of the biblical injunction
“What therefore God hath joined together, let not man put asunder.”


Appendix 3: The speed of sound 


We now use the formalism of the preceding appendix to calculate the speed of sound in a static relativistic fluid. 
We will need the result when we discuss cosmology in chapter VIII.3. 

Consider a density wave described by $n = \bar n + \delta n$, $\rho = \bar\rho + \delta\rho$, $P = \bar P + \delta P$, $s = \bar s + \delta s$, and $\vec v = 0 + \delta\vec v$. The
equilibrium quantities, indicated by an overbar, do not depend on space and time. We have also indicated that
the fluid velocity vanishes before the density wave comes along. You probably know what to do: simply expand
the relevant equations in the preceding appendix to first order in the small quantities $\delta n$, $\delta\rho$, $\delta P$, $\delta s$, and $\delta\vec v$.


* A question for you: is this v or Uy from the second section of this chapter? 
† Yes, I know. In the real world, plenty are put asunder.
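First, continuity (22) gives $\frac{\partial}{\partial t}\delta n + \bar n\,\vec\nabla\cdot\delta\vec v = 0$. Second, Euler (24) tells us that $\frac{\partial}{\partial t}\delta\vec v = -\frac{1}{\bar\rho + \bar P}\vec\nabla\delta P$. Third, (27)
says that to leading order $\frac{\partial}{\partial t}\delta s = 0$: the specific entropy does not change, which by the first law of thermodynamics
(26) implies $\bar n\,\delta\rho = (\bar\rho + \bar P)\,\delta n$. Define*

$$c_s^2 \equiv \left(\frac{\partial P}{\partial\rho}\right)_s = \frac{\delta P}{\delta\rho} \tag{28}$$

Note that the derivation given here specifies that $\frac{\partial P}{\partial\rho}$ is to be evaluated at fixed specific entropy. This enables us to
eliminate $\delta P = c_s^2\,\delta\rho = c_s^2(\bar\rho + \bar P)\,\delta n/\bar n$ in the equation for $\delta\vec v$, so that $\frac{\partial}{\partial t}\delta\vec v = -(c_s^2/\bar n)\vec\nabla\delta n$. Putting this into the
equation for $\delta n$, we finally obtain the wave equation

$$\frac{\partial^2}{\partial t^2}\delta n - c_s^2\nabla^2\delta n = 0 \tag{29}$$

(Recall that you have encountered the 1-dimensional version of this equation in appendix 2 to chapter II.3.) For
example, for a plane wave propagating along the x direction, a particular solution is $\delta n \propto \sin\{k(x - c_s t)\}$, thus
showing that $c_s$ as defined in (28) is in fact the speed of sound.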
Plugging in (19), we find that for a highly relativistic gas

$$c_s = \frac{1}{\sqrt{3}} \tag{30}$$

It is instructive to compare the differential operator $\frac{\partial^2}{\partial t^2} - c_s^2\nabla^2$ that appears in (29) with the Lorentz invariant
operator $\partial^2 = \eta^{\mu\nu}\partial_\mu\partial_\nu$ mentioned earlier, which with the speed of light $c$ restored is proportional to $\frac{\partial^2}{\partial t^2} - c^2\nabla^2$.
We have restored $c$ to emphasize that the
two operators have the same form, with $c_s$ playing the role of $c$, as was already foreshadowed by the discussion
of the baby string in appendix 3 in chapter II.3.
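
A quick numerical illustration, under stated assumptions: units with $c = 1$, the photon-gas value $c_s = 1/\sqrt{3}$ from (30), and arbitrary illustrative values for the wave number and amplitude. The sketch checks by central finite differences that the plane wave $\delta n \propto \sin\{k(x - c_s t)\}$ satisfies (29) up to discretization error.

```python
import math

C_S = 1.0 / math.sqrt(3.0)  # speed of sound for P = rho / 3, equation (30)


def delta_n(x, t, k=2.0, amp=1e-3):
    """Plane density wave moving in the +x direction at speed C_S."""
    return amp * math.sin(k * (x - C_S * t))


def wave_residual(x, t, h=1e-3):
    """Finite-difference estimate of d^2/dt^2 - c_s^2 d^2/dx^2 acting on delta_n."""
    d2t = (delta_n(x, t + h) - 2.0 * delta_n(x, t) + delta_n(x, t - h)) / h ** 2
    d2x = (delta_n(x + h, t) - 2.0 * delta_n(x, t) + delta_n(x - h, t)) / h ** 2
    return d2t - C_S ** 2 * d2x


# Each term separately is of order amp * k^2 * c_s^2 ~ 1e-3; the residual is far smaller.
print(wave_residual(0.3, 0.7))
```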


Appendix 4: The current in string theory 


This appendix may also be omitted upon first reading. Here we work out the current associated with a point
particle as it traces out a worldline $q(\tau)$. It is more or less straightforward to generalize this to an extended
object like a string. Quite aside from string theory, there is also the possibility that our universe may contain
what are known as cosmic strings.

In the preceding chapter, we worked out the string action. Recall that at a given value of $\tau$, we also have to
specify where we are along the string by another parameter $\sigma$. In other words, the spacetime location of a point
on the string is specified by† $q^\mu(\tau, \sigma)$. In contrast to the case of a particle, $q^\mu$ now depends on two parameters
$\tau$ and $\sigma$.

Think about how to generalize the current $J^\mu(x) = \int d\tau\,\frac{dq^\mu}{d\tau}\,\delta^{(4)}(x - q(\tau))$ associated with a point particle to
the current associated with a string. Try not to read ahead immediately. See if you can write it down.

I now give you hints galore. The current should of course also contain $\delta^{(d)}(x - q(\tau, \sigma))$: no current if you
are not where the string is. We need to integrate over both $\tau$ and $\sigma$. In other words, the current should have the
form $\sim \int d\tau\,d\sigma\,M\,\delta^{(d)}(x - q(\tau, \sigma))$. It remains to identify the mystery factor $M$, the generalization of $\frac{dq^\mu}{d\tau}$ in the
particle case. We now have at our disposal $\partial_\tau q^\mu = \frac{\partial q^\mu}{\partial\tau}$ and $\partial_\sigma q^\mu = \frac{\partial q^\mu}{\partial\sigma}$.

Geometry and symmetry provide the guiding lights once again. As in the discussion of the string action in
the preceding chapter, we insist that the current must be unchanged if we choose a different parametrization
$\tau \to \tau'(\tau, \sigma)$, $\sigma \to \sigma'(\tau, \sigma)$. As I said earlier, the second time around you are already an expert. Since the
integration measure $d\tau\,d\sigma = d\tau'\,d\sigma'\,J$ changes by a Jacobian (given as usual by a determinant), as discussed
in the preceding chapter, we need $M$ to be a determinant to counteract $J$. Evidently, the factor $\frac{dq^\mu}{d\tau}$ in the particle
current is generalized to the determinant of the 2-by-2 matrix $\begin{pmatrix} \partial_\tau q^\mu & \partial_\tau q^\nu \\ \partial_\sigma q^\mu & \partial_\sigma q^\nu \end{pmatrix}$.


* Note that the first subscript s refers to “sound” and the second to “specific entropy.”

† To conform to the notation used in this chapter, we use q instead of X. For a certain type of reader, I can
only offer Ralph Waldo Emerson’s famous dictum: “A foolish consistency is the hobgoblin of little minds, adored
by little statesmen and philosophers and divines.” I will assume henceforth that the reader is neither a little
statesman nor a divine.


Thus, the current associated with a string is given by

$$J^{\mu\nu}(x) = \int d\tau\,d\sigma\,\det\begin{pmatrix} \partial_\tau q^\mu & \partial_\tau q^\nu \\ \partial_\sigma q^\mu & \partial_\sigma q^\nu \end{pmatrix}\delta^{(d)}(x - q(\tau, \sigma)) \tag{31}$$

for $\mu, \nu = 0, 1, \cdots, d - 1$.
The determinant is antisymmetric under the exchange $\mu \leftrightarrow \nu$: we are led to an antisymmetric tensor current
$J^{\mu\nu}$. Thus, the analog of the electromagnetic potential $A_\mu(x)$ coupling to the particle current $J^\mu$ has to be an
antisymmetric tensor field $B_{\mu\nu}(x)$ coupling to the string current $J^{\mu\nu}$.
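
To make the role of the determinant concrete, here is a small sketch with made-up tangent vectors. It checks the antisymmetry under $\mu \leftrightarrow \nu$ and, for the special case of a constant linear change of parameters $\tau = a\tau' + b\sigma'$, $\sigma = c\tau' + d\sigma'$ (an assumption made here to keep the example short), that the determinant picks up exactly the Jacobian factor $ad - bc$ needed to compensate the change in the measure. The names and numbers are hypothetical.

```python
def tangent_det(dq_dtau, dq_dsigma, mu, nu):
    """Determinant of the 2-by-2 matrix of tangent components appearing in (31)."""
    return dq_dtau[mu] * dq_dsigma[nu] - dq_dtau[nu] * dq_dsigma[mu]


def reparametrize(dq_dtau, dq_dsigma, a, b, c, d):
    """Tangents with respect to (tau', sigma') when
    tau = a tau' + b sigma' and sigma = c tau' + d sigma' (chain rule)."""
    new_dtau = [a * t + c * s for t, s in zip(dq_dtau, dq_dsigma)]
    new_dsigma = [b * t + d * s for t, s in zip(dq_dtau, dq_dsigma)]
    return new_dtau, new_dsigma


# Illustrative tangent vectors at one point of the world sheet.
dq_dtau = [1.0, 0.2, 0.0, 0.5]
dq_dsigma = [0.0, 0.3, 1.0, -0.4]

print(tangent_det(dq_dtau, dq_dsigma, 0, 1), tangent_det(dq_dtau, dq_dsigma, 1, 0))

a, b, c, d = 2.0, 0.5, -1.0, 1.5
new_dtau, new_dsigma = reparametrize(dq_dtau, dq_dsigma, a, b, c, d)
jacobian = a * d - b * c
print(tangent_det(new_dtau, new_dsigma, 0, 1),
      jacobian * tangent_det(dq_dtau, dq_dsigma, 0, 1))
```

The last two printed numbers agree: the determinant in the primed parameters is the Jacobian times the original one, which is what lets $d\tau'\,d\sigma'$ times the new determinant equal $d\tau\,d\sigma$ times the old.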


Exercises 


1 As you probably know, the universe is suffused with a cosmic microwave background. A high-energy charged
particle (such as an electron or a proton) traversing this background will occasionally hit one of these
microwave photons, transferring its energy to the photon in what is known as inverse Compton scattering.
Indeed, observationally, it is often by detecting these high-energy photons that we deduce the presence of
sources of high-energy electrons. Show that, upon impact by a highly energetic electron (say) of energy $E$,
the maximum energy the photon can have is given by $\omega' \simeq \frac{E}{1 + \frac{m^2}{4\omega E}}$, with $m$ the mass of the electron and $\omega$
the energy of the microwave photon.


2 Consider a process in which two particles go into two particles $p_1 + p_2 \to p_3 + p_4$. All 4 particles may
have different masses $p_a^2 = -m_a^2$ for $a = 1, 2, 3, 4$. Apparently, we can form 3 Lorentz invariants, namely
$s = (p_1 + p_2)^2 = (p_3 + p_4)^2$, $t = (p_1 - p_3)^2 = (p_2 - p_4)^2$, and $u = (p_1 - p_4)^2 = (p_2 - p_3)^2$. But we have
known since childhood that there are only two kinematic variables in 2-to-2 scattering: the total energy and
the scattering angle. Show that a Lorentz invariant identity connects $s$, $t$, and $u$.


3 Verify (8) directly by plugging in (7).


4 Work out $T^{0i}$ and $T^{ij}$ for a moving perfect fluid.
(a) Apply the result to a fluid moving in the x direction. Show that $T^{xx} = \frac{P + v^2\rho}{1 - v^2}$, which deviates from the
nonrelativistic result $T^{xx} = P$. Also, show that $T^{xy} = 0$ and $T^{yy} = P$. Are you surprised?


(b) More generally, under what circumstances would $T^{ij} \neq 0$ for $i \neq j$?


5 Upon restriction of the Lorentz group to its rotation subgroup, the symmetric tensor $T^{\mu\nu}$ decomposes as
$10 \to 1 + 3 + 6$, as was shown in the text. Consider the antisymmetric tensor $F^{\mu\nu} = -F^{\nu\mu}$. Show that it has
6 components and that it decomposes as $6 \to 3 + 3$.


6 Verify that the string current $J^{\mu\nu}$ does not depend on the coordinate choice on the world sheet or world tube.


7 Show that $M^{\mu\nu} = \int d^3x\,(x^\mu T^{0\nu} - x^\nu T^{0\mu})$ describes the angular momentum of the system.


Notes 


1. A. Einstein, “Does the Inertia of a Body Depend on Its Energy Content?” Ann. Phys. 18 (1905), p. 639. 

2. For the historical importance of this result, see R. Baierlein, Newton to Einstein. 

3. For that you need quantum field theory (for example, QFT Nut, chapter II.8), but way back when, the result 
(2) empirically verified sufficed for a Nobel Prize. Of course, the prize was actually for discovering the effect. 
4. By the same token, if a worldline terminates, then the number of particles is not conserved at that instant. In
particle physics, one particle can decay into other particles, for example, in the decay of a negatively charged
pion into an electron and an antineutrino $\pi^- \to e^- + \bar\nu$. The worldline of the pion would then terminate
at some point P in spacetime, while the worldlines for the electron and the antineutrino would commence
there. For the level of discussion in this chapter, this is hardly something the reader needs to be concerned
about. Note, however, that this notion of worldlines ending and beginning contains the seeds of Feynman
diagrams. See, for example, QFT Nut.

5. Interestingly, there was a history of speculations in the 19th century concerning the energy contained in
mass. See J. Stachel’s commentary in A. Einstein, Einstein’s Miraculous Year.

6. A notable exception is R. Baierlein, Newton to Einstein. I am grateful to R. Baierlein for providing me the
original reference: A. Einstein, Technion Yearbook 5 (1946) p. 16.

7. Some technical assumptions, the discussion of which would take us too far afield, have been implicitly made
here. What we have shown here is that the flow is adiabatic; hence the fluid does not exchange heat as it
flows. To say that the right hand side of (26) is $T\,ds$ requires an additional assumption: the flow is quasistatic
between consecutive states of thermodynamic equilibria, for which entropy is defined. This assumption can
be justified under some circumstances.


Recap to Part III 


In the showdown between t’ = t and c’ = c, the former blinked, and lost. 

With t’ no longer chained to t, Lorentz found a nicer transformation than the one Galileo
found, a transformation that was more symmetrical and associated with a better group, 
and hence is more pleasing to the eye. The group consists of “rotations” in a (3 + 1)- 
dimensional spacetime. 

Instead of approaching special relativity as a series of would-be paradoxes, we should 
learn to appreciate the geometry of Minkowskian spacetime in which a straight line 
between two points may be the longest, rather than the shortest, path. The key to the action 
governing the motion of particles in this spacetime was hidden in how Babylonians, and 
smart school children, figure out the square root of a number that is almost, but not quite, 
a perfect square. 

Various physical quantities had to be promoted and completed. Not only did this reveal 
a terrifying secret about the energy locked up in mass, it also gave us a clue about the true 
nature of the gravitational field. 

Incidentally, we might as well drop the prime and simply write c = c. 


Part IV: Electromagnetism and Gravity


You Discover Electromagnetism and Gravity! 


A bright young theoretical physicist 


Once again, imagine yourself a bright young theoretical physicist in some civilization far 
far away, perhaps the same guy in chapter III.5 or perhaps a successor who discovered that 
guy’s obscure work. Who knows what the environment is like, perhaps the civilization is 
floating in some molecular cloud, and who knows what the order of physics discoveries 
might be in an environment radically different from ours? Perhaps the speed of light was 
established to be independent of observers in relative uniform motion before electromag- 
netism was understood. 

Every morning (assuming such a phenomenon exists over there), you admire the elegance
of the action

$$S = -m\int\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu} = -m\int\sqrt{dt^2 - d\vec x^{\,2}} \tag{1}$$

Elegant indeed! But how do you get the particle to interact with the rest of the world?
You look at the nonrelativistic action

$$S_{NR} = \int dt\left(\frac{1}{2}m\left(\frac{d\vec x}{dt}\right)^2 - V(\vec x)\right) \tag{2}$$

for inspiration. The point particle interacts with the world through the external potential
$V(\vec x)$. How do you include $V(\vec x)$ in the relativistic point particle action (1)?


Two options: Outside or inside 


One day, you realize that you could put $V(x)$ either outside or inside the square root in (1).
So, you excitedly write a paper proposing not one, but two, possible actions:

$$\text{Option E:}\qquad S = -\int\left\{m\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu} + V(x)\,dt\right\} \tag{3}$$

or

$$\text{Option G:}\qquad S = -m\int\sqrt{\left(1 + \frac{2V}{m}\right)dt^2 - d\vec x^{\,2}} \tag{4}$$


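In option G, you have to expand twice to get back to the nonrelativistic action. First, take
$|d\vec x| \ll dt$ (that is, the distance the particle traverses, $|d\vec x|$, is tiny compared to $c\,dt$) so that

$$\sqrt{\left(1 + \frac{2V}{m}\right)dt^2 - d\vec x^{\,2}} \simeq \left(\sqrt{1 + \frac{2V}{m}} - \frac{1}{2\sqrt{1 + \frac{2V}{m}}}\left(\frac{d\vec x}{dt}\right)^2\right)dt \tag{5}$$

Second, take the potential energy $V$ to be much smaller than $m$ (that is, the rest energy
$mc^2$), so that $\sqrt{1 + \frac{2V}{m}} \simeq 1 + \frac{V}{m}$. Note that in (5), the second term is already much smaller
than the first term, so we do not have to keep the $\frac{V}{m}$ correction to the second term. Thus,

$$S \simeq -m\int\left\{\left(1 + \frac{V}{m}\right) - \frac{1}{2}\left(\frac{d\vec x}{dt}\right)^2\right\}dt = \int dt\left[\frac{1}{2}m\left(\frac{d\vec x}{dt}\right)^2 - V - m\right] \tag{6}$$

The action in option G leads to the Newtonian action (2) in the appropriate limits, but with
a mysterious additive constant $-m$ that does not figure in the equation of motion. Well,
the real you knows what that is.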
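
The double expansion is easy to test numerically. The sketch below, with $c = 1$ and illustrative numbers, compares the integrand of option G per unit $dt$ with the Newtonian Lagrangian shifted by $-m$; the function names are made up for this example. For small $V/m$ and $v$ the two agree to second order in the small quantities, and the agreement degrades as they grow.

```python
import math


def lagrangian_option_g(m, V, v):
    """Integrand of option G (4) per unit dt: -m sqrt((1 + 2V/m) - v^2)."""
    return -m * math.sqrt(1.0 + 2.0 * V / m - v * v)


def lagrangian_newton_shifted(m, V, v):
    """Nonrelativistic limit (6): m v^2 / 2 - V - m."""
    return 0.5 * m * v * v - V - m


for V, v in ((2e-4, 1e-2), (2e-2, 1e-1), (2e-1, 3e-1)):
    exact = lagrangian_option_g(1.0, V, v)
    approx = lagrangian_newton_shifted(1.0, V, v)
    print(V, v, exact, approx, abs(exact - approx))
```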

You excitedly submit to a journal, and this time, remarkably, you actually get a perspicacious
referee, who rejects the paper saying that the added term, in both option E and
option G, is manifestly not Lorentz invariant: $dt$ plays a more privileged role than $d\vec x$. Boy,
didn’t think of that! Dumb!


Symmetry, completion, and promotion 


A clarifying comment about symmetry and invariance: consider the harmonic oscillator,
that is, (2) with $V(x) = \frac{1}{2}kx^2$. It is not translation invariant, but in a trivial way. We have
simply excluded the much heavier mass that the spring is anchored to. Including that heavy
mass, we replace (2) by $S_{NR} = \int dt\left(\frac{1}{2}m\left(\frac{dx}{dt}\right)^2 - V(x - X) + \frac{1}{2}M\left(\frac{dX}{dt}\right)^2\right)$, or, if we insist
on focusing on the small mass tied to the spring, by $S_{NR} = \int dt\left(\frac{1}{2}m\left(\frac{dx}{dt}\right)^2 - V_X(x)\right)$ with
an external potential $V_X(x) = V(x - X)$ that depends on some parameter $X$. Translation
invariance holds if we transform both $x \to x + a$ and $X \to X + a$.

Any beginning student of physics understands all this. Similarly here, if we think of the 
external potential V(x) as imposed from the outside and fixed, then of course the action 
can never be made invariant. It is understood that we also have to transform V(x). 

You think hard, but confound it! Option E in (3) and option G in (4) are the only two 
possibilities you can think of. Either the added interaction term is inside the square root 
or it is outside. Inside or outside. Where else could it be? Puzzled, you file the manuscript 
away in a drawer. Years later, you decide to come back to it and see if you can make it 
Lorentz invariant. 
Dear reader, who may or may not be the same person as the you with the two marvelous
actions, can you see how? The key is the relativistic completion we learned in chapter III.6.


Electromagnetism and gravity 


You stare at option E in (3) for the longest time, and suddenly it becomes obvious! You
have to relativistically complete the action. Promote $V(x)$ to be the time component $A_0(x)$
of a Lorentz vector field $A_\mu(x)$, and $V(x)\,dt$ could be just the first term in $A_\mu(x)dx^\mu = A_0(x)\,dt + A_i(x)\,dx^i$.
You merely have to introduce a vector field $A_\mu(x)$ into the universe!
You propose the action

$$\text{Option E improved:}\qquad S = \int\left\{-m\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu} + A_\mu(x)\,dx^\mu\right\} \tag{7}$$

Comparing with (2), we see that we should identify $A_0 = -V$. When we Lorentz transform
$x$, we must Lorentz transform $A_\mu(x)$ as well, just as in the example with the spring tied to
an external massive object. (It is also implicitly understood that the argument $x$ in $A_\mu(x)$
now includes time as well as space.) The expression $A_\mu(x)dx^\mu$, the contraction of two
Lorentz vectors, is manifestly a Lorentz scalar. With this understanding, the action $S$ in (7)
is Lorentz invariant.

After this great triumph, you immediately try to relativistically complete option G also.
Staring at the expression $\left(1 + \frac{2V}{m}\right)dt^2 - d\vec x^{\,2}$ inside the square root in (4), you understand
that the key is democracy between $dt$ and $d\vec x$. If $dt^2$ is multiplied by some function, then
$d\vec x^{\,2}$ should be too. Denoting $\left(1 + \frac{2V}{m}\right)$ by $g$, you write something like $g(x)\,dt^2 - \tilde g(x)\,d\vec x^{\,2}$,
with $g$ and $\tilde g$ depending on spacetime. But Lorentz transformations “mix up” $g$ and $\tilde g$. Even
worse, $dt^2$ gets transformed into a linear combination of $dt^2$, $(dx^i)^2$, and $dt\,dx^i$. It would seem
that you can’t get away without including $dt\,dx^i$ inside the square root.

Eventually, you realize that (up to a sign) $dt^2 - d\vec x^{\,2}$ started out in life as the Lorentz
tensor $dx^\mu dx^\nu$ contracted with the Minkowski metric $\eta_{\mu\nu}$, and so the answer has been
staring you in the face: you must relativistically complete by promoting $\left(1 + \frac{2V}{m}\right)$ to be
the time-time component of a Lorentz tensor $g_{\mu\nu}(x)$. In other words, you should promote
$\eta_{\mu\nu}dx^\mu dx^\nu$ to $g_{\mu\nu}(x)dx^\mu dx^\nu$, that is, promote the fixed numerical matrix $\eta_{\mu\nu}$ to a matrix
field $g_{\mu\nu}(x)$ varying in spacetime.

So you introduce the tensor field $g_{\mu\nu}(x)$ into the universe, and quickly publish another
action:

$$\text{Option G improved:}\qquad S = -m\int\sqrt{-g_{\mu\nu}(x)\,dx^\mu dx^\nu} \tag{8}$$

You now understand (4) as a mere special case of (8), upon restricting $g_{\mu\nu}(x)$ to the special
form $g_{00} = -\left(1 + \frac{2V}{m}\right)$, $g_{0i} = g_{i0} = 0$, and $g_{ij} = \delta_{ij}$.

With these three papers, you are bound for the extragalactic version of Stockholm 
for sure! 

Dear reader, you might have realized that the extragalactic version of you has just 
discovered electromagnetism and gravity. A double whammy! If not, read on. 
Electromagnetism pops out of special relativity


Let us look at electromagnetism (option E) first and postpone gravity (option G) until 
chapter IV.3. 

Excitedly, you vary the action in (7) to see what you get. As always, we parametrize with
the proper time:

$$S = -m\int d\tau\sqrt{-\eta_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}} + \int d\tau\,A_\mu(x(\tau))\frac{dx^\mu}{d\tau} \tag{9}$$

Note that in the second term, $A_\mu$ is evaluated at the spacetime position $x^\mu(\tau)$ of the particle.
In other words, the field $A_\mu(x)$ pervades spacetime, but the particle only samples the field
at its particular location.

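The variation of the first term in (9) is easy and was done in chapter III.5, which we
repeat here for convenience:

$$\delta\left(-m\int d\tau\sqrt{-\eta_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}}\right) = m\int d\tau\,\eta_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{d\delta x^\nu}{d\tau} = -m\int d\tau\,\eta_{\rho\mu}\frac{d^2x^\mu}{d\tau^2}\,\delta x^\rho \tag{10}$$

Notice that in the last step, for later convenience, we have renamed a dummy.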
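The variation of the second term in (9) is a bit more involved:

$$\delta\int d\tau\,A_\mu(x)\frac{dx^\mu}{d\tau} = \int d\tau\left\{A_\mu(x)\frac{d\delta x^\mu}{d\tau} + [\partial_\nu A_\mu(x)]\,\delta x^\nu\frac{dx^\mu}{d\tau}\right\} \tag{11}$$

We have to remember that $A_\mu(x)$ depends on $x$ in order not to miss the $\partial_\nu A_\mu(x)$ term in
(11). Integrate the first term in (11) by parts:

$$\int d\tau\,A_\mu(x)\frac{d\delta x^\mu}{d\tau} = -\int d\tau\,\frac{dA_\mu(x)}{d\tau}\,\delta x^\mu = -\int d\tau\,\partial_\nu A_\mu(x)\frac{dx^\nu}{d\tau}\,\delta x^\mu \tag{12}$$

Putting the two terms together and renaming indices, we find

$$\delta\int d\tau\,A_\mu(x)\frac{dx^\mu}{d\tau} = \int d\tau\,(\partial_\mu A_\nu - \partial_\nu A_\mu)\frac{dx^\nu}{d\tau}\,\delta x^\mu \tag{13}$$

The antisymmetric tensor field

$$F_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x) \tag{14}$$

just popped out! Some readers may recognize this as the electromagnetic field; if you don’t,
once again read on.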
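Putting (10) and (13) together, we have

$$\delta S = \int d\tau\left(-m\,\eta_{\rho\mu}\frac{d^2x^\mu}{d\tau^2}\,\delta x^\rho + F_{\mu\nu}\frac{dx^\nu}{d\tau}\,\delta x^\mu\right) \tag{15}$$

Now define $F^\mu{}_\nu = \eta^{\mu\rho}F_{\rho\nu}$, so that the last term in (15) can be written as $\eta_{\rho\mu}F^\mu{}_\nu\frac{dx^\nu}{d\tau}\,\delta x^\rho$.
Setting the coefficient of $\eta_{\rho\mu}\delta x^\rho$ to zero, we obtain the equation of motion

$$m\frac{d^2x^\mu}{d\tau^2} = F^\mu{}_\nu(x)\frac{dx^\nu}{d\tau} \tag{16}$$

which has precisely the form given in (III.3.20).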
Lo and behold, we have discovered the Lorentz force law (more below) for a charged
particle moving in an electromagnetic field! We have discovered electromagnetism!


Electromagnetism came looking for us 


We did not go looking for electromagnetism; electromagnetism came looking for us. Some
readers may need to be reminded that $F_{\mu\nu}$ represents the electromagnetic field and that
(16) describes a charged particle in an electromagnetic field. Before showing that, let’s
do something simpler: we check that we recover Newton’s second law in the nonrelativistic
limit. Since $dt = dx^0 \gg dx^i$ and $(dx^0)^2 - d\vec x^{\,2} = d\tau^2$, we have $\frac{dx^0}{d\tau} \simeq 1 \gg \frac{dx^i}{d\tau}$. Setting $\mu$ to $i$
in (16) gives $m\frac{d^2x^i}{dt^2} \simeq F^i{}_0(x)\frac{dx^0}{d\tau} \simeq F^i{}_0(x)$. Comparing (2) and (7), we are reminded that the
only nonzero component of $A_\mu$ is $A_0(x) = -V(x)$, so that $F^i{}_0 = F_{i0} = \partial_i A_0 = -\partial_i V$. We
recover Newton’s second law $m\frac{d^2\vec x}{dt^2} = -\vec\nabla V$, hardly surprising given how we went from (2)
to (7).

Next, we identify the usual electric $\vec E$ and magnetic $\vec B$ field as follows:

$$B^1 = F^{23} = F_{23}, \quad B^2 = F^{31} = F_{31}, \quad B^3 = F^{12} = F_{12},$$
$$E^1 = F^{01} = -F_{01}, \quad E^2 = F^{02} = -F_{02}, \quad E^3 = F^{03} = -F_{03} \tag{17}$$

We also write $\vec B = (B^1, B^2, B^3)$ and $\vec E = (E^1, E^2, E^3)$. (At this stage in the book, I hardly
need to remind the reader that while $\vec B$ and $\vec E$ transform like 3-vectors under rotation, they
are not the spatial components of two 4-vectors. In particular, the index $i$ on $B^i$ (and ditto
on $E^i$) is not to be lowered and raised by the standard rules for Lorentz indices. They are
just convenient labels.)

To show that (16) indeed reproduces the Lorentz force law, let us write it in terms of the
4-momentum $p^\mu = m\frac{dx^\mu}{d\tau}$:
$$\frac{dp^\mu}{d\tau} = F^\mu{}_\nu(x)\frac{dx^\nu}{d\tau} \qquad (18)$$
Look first at the spatial components. Set $\mu$ to 1: then $\frac{dp^1}{d\tau} = -F^{10}\frac{dx^0}{d\tau} + F^{12}\frac{dx^2}{d\tau} + F^{13}\frac{dx^3}{d\tau}$.
(Here we raised the lower index on $F^\mu{}_\nu$ using the Minkowski metric $\eta^{\mu\nu}$ and used the
antisymmetry of $F^{\mu\nu}$.) With the identification in (17), we obtain $\frac{dp^1}{d\tau} = E^1\frac{dx^0}{d\tau} + \frac{dx^2}{d\tau}B^3 - \frac{dx^3}{d\tau}B^2$.
Multiply throughout by $\frac{d\tau}{dt}$ to convert the derivatives with respect to $\tau$ to derivatives
with respect to $t$. (This step is optional and in fact best omitted.) By rotational symmetry,
we can write this as

$$\frac{d\vec p}{dt} = \vec E + \vec v\times\vec B \qquad (19)$$

with $\vec v = \frac{d\vec x}{dt}$. The Lorentz force law just popped out!

Good. We just found out what the spatial components of (18) say. What about the time
component? Three guesses.

The time component $\mu = 0$ of (18) tells us that $\frac{dp^0}{d\tau} = F^0{}_i\frac{dx^i}{d\tau} = E^i\frac{dx^i}{d\tau}$. But because $p^0$ is just
the energy $E$ of the particle, we learn, upon multiplying by $\frac{d\tau}{dt}$ and defining $\vec v = \frac{d\vec x}{dt}$, that

$$\frac{dE}{dt} = \vec E\cdot\vec v \qquad (20)$$

describing how a charged particle in an electric field $\vec E$ gains energy $E$. It is worth
emphasizing that (19) and (20) are fully, but not manifestly, relativistic.
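
As a numerical cross-check of (17) through (20), here is a minimal sketch in plain Python. It assumes the metric signature $(-,+,+,+)$, units with $c = 1$, and a charge absorbed into the field as in (16). The function names and the sample values of $\vec E$, $\vec B$, and $\vec v$ are illustrative choices, not taken from the text.

```python
# Signature (-,+,+,+), c = 1, charge absorbed into the field as in (16).
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def field_tensor_upper(E, B):
    """Build F^{mu nu} from the 3-vectors E and B using the identification (17)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]       # E^i = F^{0i}
        F[i + 1][0] = -E[i]      # antisymmetry
    F[2][3], F[3][2] = B[0], -B[0]   # B^1 = F^{23}
    F[3][1], F[1][3] = B[1], -B[1]   # B^2 = F^{31}
    F[1][2], F[2][1] = B[2], -B[2]   # B^3 = F^{12}
    return F

def mixed(F_upper):
    """Lower the second index: F^mu_nu = F^{mu rho} eta_{rho nu}."""
    return [[sum(F_upper[m][r] * ETA[r][n] for r in range(4)) for n in range(4)]
            for m in range(4)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def four_force(E, B, v):
    """Right side of (18), F^mu_nu dx^nu/dtau, for a 3-velocity v with |v| < 1."""
    gamma = (1.0 - sum(vi * vi for vi in v)) ** -0.5
    u = [gamma] + [gamma * vi for vi in v]        # dx^mu/dtau
    Fm = mixed(field_tensor_upper(E, B))
    return gamma, [sum(Fm[m][n] * u[n] for n in range(4)) for m in range(4)]

# Arbitrary sample values (hypothetical, chosen only for the check).
E = [0.3, -1.2, 0.5]
B = [0.7, 0.1, -0.4]
v = [0.2, 0.5, -0.3]

gamma, f = four_force(E, B, v)
v_cross_B = cross(v, B)
lorentz = [E[i] + v_cross_B[i] for i in range(3)]   # right side of (19)
power = sum(E[i] * v[i] for i in range(3))          # right side of (20)

# d/dt = (dtau/dt) d/dtau = (1/gamma) d/dtau
spatial_ok = all(abs(f[i + 1] / gamma - lorentz[i]) < 1e-12 for i in range(3))
time_ok = abs(f[0] / gamma - power) < 1e-12
print(spatial_ok, time_ok)
```

Dividing by `gamma` is the optional step of converting $\tau$ derivatives to $t$ derivatives; the comparison works equally well if both sides are left in terms of $\tau$.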

What I have done here is write a bit of an alternative history of physics in a galaxy far 
far away. Imagine that in this civilization, we knew nothing about electromagnetism, but 
somehow, by doing an experiment, our savants discovered to everybody’s astonishment 
that the speed of light is a universal constant independent of the observer. Then by studying 
the addition of velocities, we discovered special relativity and then electromagnetism. This 
was not how it happened in our civilization, but it could have. 


The notion of charge 


Once again, it is time for a bad notation alert! As before, it would be best to denote the
position of the particle not by the generic $x$, but by $q$ or $X$. Let us choose $X$ and write
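$$S = -m\int d\tau\sqrt{-\eta_{\mu\nu}\frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau}} + \int d\tau\, A_\mu(X(\tau))\frac{dX^\mu}{d\tau} \qquad (21)$$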
This makes clear that in (16), the electromagnetic field $F(x)$ is to be evaluated at the
position of the particle, as was already emphasized in the discussion following (9).

As always, the point becomes glaringly clear if we think about the case of many particles
labeled by $a = 1, 2, \cdots, N$, possibly with different masses $m_a$. Indeed, denote the spacetime
position of particle $a$ by $X_a$. In the generalization of (21), $A_\mu(x)$ is to be evaluated
at $x = X_a(\tau)$. In other words, while $A_\mu(x)$ exists throughout spacetime, in the action, it
“knows” about particle $a$ only through $X_a(\tau)$. This is what we mean by saying that the
action is local. Of course, the electromagnetic field could acquire dynamics of its own (as
we will see in the next chapter), in which case the physical effects of the electromagnetic
field would be propagated throughout spacetime.*

So, write the action

$$S = \sum_a\int d\tau_a\left(-m_a\sqrt{-\eta_{\mu\nu}\frac{dX_a^\mu}{d\tau_a}\frac{dX_a^\nu}{d\tau_a}} + e_a A_\mu(X_a(\tau_a))\frac{dX_a^\mu}{d\tau_a}\right) \qquad (22)$$

When we do that, we bump up against another important point. We see that if we have
more than one particle, then when we add the interaction term between particles and field
to the action, we can allow, as in (22), each particle to “couple” to the new field $A_\mu$ with a
different strength¹ $e_a$, which we will call charge. In contrast, in (21), there is no point in
introducing $e$, since it could be absorbed into the definition of $A_\mu$. The action produces
the equations of motion:

$$\frac{dp_a^\mu}{d\tau_a} = e_a F^\mu{}_\nu(X_a)\frac{dX_a^\nu}{d\tau_a} \qquad (23)$$


* Verily, this is why we need fields: we like to have a local action, but at the same time, also physical effects 
that can propagate from one point to another. 
I prefer to write (22) in a parametrization independent form:

$$S = \sum_a\int\left\{-m_a\sqrt{-\eta_{\mu\nu}\,dX_a^\mu dX_a^\nu} + e_a A_\mu(X_a)\,dX_a^\mu\right\} \qquad (24)$$


Relativistic unification 


In our study of physics, we typically encounter the two vector fields E and B in stages 
and are then told that they are components of an electromagnetic field tensor $F^{\mu\nu}$. In
fact, if we are allowed an antihistorical perspective, we see that, given special relativity, we 
could have anticipated the packaging of two 3-vectors into an antisymmetric tensor with a 
combination of physical and mathematical considerations. 

Einstein’s special relativity forcefully unifies E and B. In chapter III.6, we spoke of 
relativistic completion. Suppose at the dawn of special relativity, you were given E and B 
and asked to complete them. Your first thought, that the 3-vector E gets promoted to be 
part of a 4-vector, cannot be correct: this would mean that under a boost, the change of E 
is a rotational scalar, but since Ørsted’s great discovery in 1820, it has been known that
E transforms into a linear combination of itself and B. Thus, the simplest guess is that 
the two 3-vectors E and B get unified into an antisymmetric Lorentz tensor. (Also recall 
exercise III.6.5.) 

The electric field and the magnetic field are unified when time and space get unified. 
To many theoretical physicists, the identification in (17), important though it is to make 
contact with experimental physics, is akin to taking an exquisite object of art and pulling 
it apart. 


Exercise 


1 Show that the Lorentz force law (18) is consistent with $p^2$ being a constant.


Note 


1. Note that the $e_a$ can be arbitrary real numbers of either sign. To understand why the electron charge is
precisely equal and opposite to the proton charge, you would have to learn quantum field theory. See QFT
Nut, chapter VII.5.


IV.2 Electromagnetism Goes Live 


The electromagnetic field in action 


After your triumph in the preceding chapter, people naturally ask you how the field $A_\mu(x)$
is generated. Electrodynamics should be a mutual dance between particles and field. The 
field causes the charged particles to move, and the charged particles should in turn generate 
the field. 

We had the first half of this dynamics in the preceding chapter. Now we have to describe 
the second half; in other words, we are going to look for the action governing the dynamics 
of $A_\mu(x)$.

To construct the action, as we have seen many times by now, it is imperative that we 
understand all the relevant symmetries first. Lorentz invariance has to be imposed of 
course. But then a brilliant young physicist, not you this time for a change, notices a 
peculiar invariance of the term you added to the action 


$$\int A_\mu(x)\,dx^\mu \qquad (1)$$


Recall that $x^\mu(\tau)$ here represents the trajectory of a charged particle.


Gauge invariance 


Consider the transformation 
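$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) + \partial_\mu\Lambda(x) \qquad (2)$$


for some function $\Lambda(x)$. Then (1) changes by

$$\delta\int_i^f A_\mu(x)\,dx^\mu = \int_i^f dx^\mu\,\partial_\mu\Lambda(x) = \int_{\tau_i}^{\tau_f} d\tau\,\frac{d\Lambda(x)}{d\tau} = \Lambda(x)\Big|_i^f \qquad (3)$$
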


Formally, we take the beginning and end of the particle’s trajectory in the far past and
future and assume that $\Lambda(x)$ vanishes at infinity. Then the action does not change under
the transformation in (2), known as a gauge transformation. We have discovered a hidden
symmetry of the action, called gauge invariance.

Strictly speaking, a gauge symmetry of the type discussed here is not a symmetry,
but rather a redundancy in the description. The statement here is that $A_\mu(x)$ and $A'_\mu(x)$
describe the same physics; in other words, $A_\mu(x)$ contains degrees of freedom that are not
physical, which could be removed by appropriate choices of $\Lambda(x)$. There is a great deal
more I could say about this rather subtle subject, but for now I am content to refer you to
various field theory texts.¹

In the preceding chapter, the electromagnetic field strength tensor

$$F_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x) \qquad (4)$$

which contains the familiar electric and magnetic fields, naturally emerged when we varied
(1). Now we understand on symmetry grounds why this particular combination must
appear. Whatever emerges from varying a gauge invariant action has to be gauge invariant.
The gauge potential $A$ is not gauge invariant, but the field strength $F$ is. Under a gauge
transformation, we have

$$F_{\mu\nu} \to \partial_\mu[A_\nu(x) + \partial_\nu\Lambda(x)] - \partial_\nu[A_\mu(x) + \partial_\mu\Lambda(x)] = F_{\mu\nu} + \partial_\mu\partial_\nu\Lambda - \partial_\nu\partial_\mu\Lambda = F_{\mu\nu} \qquad (5)$$

so that $F_{\mu\nu}$ does not change.
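
The cancellation in (5) can also be checked numerically. The sketch below uses central finite differences in plain Python; the particular potential `A`, the gauge function `Lam`, the sample point, and the step size are all hypothetical choices made only for illustration.

```python
import math

def A(x):
    """A hypothetical smooth potential A_mu(x), with x = (t, x, y, z)."""
    t, X, Y, Z = x
    return [math.sin(X) * Y, t * Z, math.cos(t + Y), X * X - Z]

def Lam(x):
    """A hypothetical gauge function Lambda(x)."""
    t, X, Y, Z = x
    return math.sin(t * X) + Y * Z * Z

def partial(f, x, mu, h=1e-4):
    """Central-difference estimate of the derivative of f along x^mu at the point x."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (f(xp) - f(xm)) / (2 * h)

def A_prime(x):
    """Gauge-transformed potential A'_mu = A_mu + d_mu Lambda, as in (2)."""
    a = A(x)
    return [a[mu] + partial(Lam, x, mu) for mu in range(4)]

def field_strength(pot, x):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu, as in (4), by finite differences."""
    d = [[partial(lambda y, n=nu: pot(y)[n], x, mu) for nu in range(4)]
         for mu in range(4)]
    return [[d[mu][nu] - d[nu][mu] for nu in range(4)] for mu in range(4)]

x0 = [0.3, -0.7, 1.1, 0.4]
F_old = field_strength(A, x0)
F_new = field_strength(A_prime, x0)

# The potential itself changes noticeably ...
potential_changed = max(abs(p - q) for p, q in zip(A_prime(x0), A(x0))) > 0.1
# ... but the field strength agrees up to rounding error.
max_diff = max(abs(F_new[m][n] - F_old[m][n]) for m in range(4) for n in range(4))
print(potential_changed, max_diff)
```

The nested differences that approximate $\partial_\mu\partial_\nu\Lambda$ and $\partial_\nu\partial_\mu\Lambda$ sample the same four points, so the discrete version cancels for the same reason as the continuum one: derivatives commute.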

Thus, the action for electrodynamics we are searching for should be constructed out of
$F_{\mu\nu}(x)$. To obtain a Lorentz invariant object, the simplest possibility would be to “square” $F$
and contract the indices, obtaining $F^{\mu\nu}F_{\mu\nu}$. Note that this term also contains two powers
of the time derivative $\frac{\partial}{\partial t}$, in line with all the actions we have encountered thus far. We
integrate this Lorentz scalar over spacetime $\int d^4x\left(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}\right)$ and add it to the action we
had in (IV.1.21). (The factor $-\frac{1}{4}$ is conventional.)*


Discovering Maxwell 


We have discovered Maxwell’s Lagrangian! 
The Maxwellian Lagrangian† $\mathcal{L} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu}$ when expanded out contains a number of
terms, in particular $(\partial_0 A_i)^2$. The two powers of $\partial_0$ are the same as the two powers of time
or proper time derivative in the Newtonian Lagrangian $\left(\frac{dq}{dt}\right)^2$ or the Lorentzian Lagrangian
$\eta_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}$.

The bad notation that I keep harping on is particularly bothersome here. It is important
to carefully distinguish between dynamical variables and mere labels. In Newtonian
mechanics, the position of the particle $x(t)$, or better, $q_a(t)$ or $X_a(t)$, is the dynamical
variable. We have also indicated the possibility of having several particles, labeled by $a$. Here
the dynamical variable is a field $A_\mu(x)$, perhaps better written more explicitly as $A_\mu(\vec x, t)$,
whose time dependence we want to study. We see clearly that the $\vec x$, which the dynamical
variable $A_\mu$ depends on, is a label, not a dynamical variable: it tells us which $A_\mu$ we are
talking about, namely the $A_\mu$ at the spatial location $\vec x$. Thus, the translation table² from
the mechanics of several particles to a field theory (such as Maxwell’s Lagrangian) contains
$a \to \vec x$ and $X_a(t) \to A_\mu(\vec x, t)$.


* Once we have discovered the possibility of including charge, as in (IV.1.23), we are free to scale $A_\mu \to \lambda A_\mu$
and the charges accordingly, and thus set the coefficient of $F^{\mu\nu}F_{\mu\nu}$ in the action to be any real number we like.
† Some purists insist on calling $\mathcal{L}$ a Lagrangian density, since it is $\int d^3x\,\mathcal{L}$ that has the same status as the point
particle Lagrangian $L = \frac{1}{2}m\left(\frac{dq}{dt}\right)^2$. I would rather abuse terminology than clutter the page.

Because of gauge invariance (2), the vector field $A_\mu$ contains fewer degrees of freedom
than meet the eye, reflecting the redundancy I mentioned earlier. As you already know
from classical electromagnetism, you have to practice the arcane art of fixing the gauge,³
after which an electromagnetic wave has only two polarizations. In quantum field theory,
you learn to quantize the electromagnetic field. When you do this, the electromagnetic
field is described in terms of photons. Even though the photon carries spin 1, it has only⁴
two spin states, corresponding to the two polarizations of the classical wave.


Coupling of field and particles 


So now we have the complete action for charged particles and the electromagnetic field: 


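$$S = \sum_a\int\left(-m_a\sqrt{-\eta_{\mu\nu}\,dX_a^\mu dX_a^\nu} + e_a A_\mu(X_a)\,dX_a^\mu\right) - \int d^4x\,\frac{1}{4}F^{\mu\nu}F_{\mu\nu} \qquad (6)$$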


I have purposely written it in this peculiar form to emphasize that the nature of the 
integral differs significantly for the three terms. The first term describes free massive 
particles. The third term describes a field. The world of particles and the world of the field 
would be forever estranged were it not for the second term, which couples particles and 
field. 

This may remind you of something you have seen. Indeed, back in chapter II.3, we wrote
down the action for Newtonian gravity consisting also of three terms:

$$S = \int dt\sum_a\frac{1}{2}m_a\left(\frac{d\vec q_a}{dt}\right)^2 - \int dt\,d^3x\left(\sum_a m_a\,\delta^{(3)}(\vec x - \vec q_a(t))\right)\Phi(\vec x, t) - \int dt\,d^3x\,\frac{1}{8\pi G}(\vec\nabla\Phi)^2 \qquad (7)$$

The first term describes the dynamics of the particles, the third term describes the
gravitational field $\Phi(\vec x, t)$, and the second term couples the particles and the field together.
In Newtonian gravity, the field dictates how the particles move, and the particles in turn
generate the field.

In exactly the same way, in electromagnetism, the field dictates how the particles move, 
and the particles in turn generate the field. 


How the field moves 


To obtain the equation of motion for the electromagnetic field, we vary $S$ with respect
to the field $A_\mu(x)$. But we already learned how to vary with respect to a field, also in
chapter II.3, when we discussed the dynamical string described by the field $\phi(t, x)$. The
only complication for the electromagnetic field is the presence of Lorentz indices, but they
mostly just go along for the ride, so to speak.

Let’s start by varying the integrand of the third term in (6): $\delta(F^{\mu\nu}F_{\mu\nu}) = 2F^{\mu\nu}\delta F_{\mu\nu} =
4F^{\mu\nu}\partial_\mu\delta A_\nu$. (The reader should understand the two factors of 2 and hence the conventional
choice of $\frac{1}{4}$ in (6).) Thus, the variation of the third term is

$$\delta\int d^4x\left(-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}\right) = \int d^4x\,\partial_\mu F^{\mu\nu}(x)\,\delta A_\nu(x) \qquad (8)$$

where we have integrated by parts. The first term in (6) does not depend on the field and
so may be ignored. Thus, were the second term not present, we would have obtained, by
setting the variation to zero, the free Maxwell’s equations

$$\partial_\mu F^{\mu\nu} = 0 \qquad (9)$$
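
For readers who want the two factors of 2 in the variation of $F^{\mu\nu}F_{\mu\nu}$ written out, here is one way to do the bookkeeping. It is a sketch that assumes only the definition (4) of $F_{\mu\nu}$, its antisymmetry, and that boundary terms can be dropped.

```latex
\begin{align*}
\delta(F^{\mu\nu}F_{\mu\nu}) &= 2F^{\mu\nu}\,\delta F_{\mu\nu}
  && \text{(the expression is quadratic in } F\text{)} \\
&= 2F^{\mu\nu}(\partial_\mu\delta A_\nu - \partial_\nu\delta A_\mu) \\
&= 4F^{\mu\nu}\,\partial_\mu\delta A_\nu
  && \text{(antisymmetry: } F^{\mu\nu} = -F^{\nu\mu}\text{)} \\
\delta\int d^4x\left(-\tfrac{1}{4}F^{\mu\nu}F_{\mu\nu}\right)
  &= -\int d^4x\,F^{\mu\nu}\,\partial_\mu\delta A_\nu
   = \int d^4x\,\partial_\mu F^{\mu\nu}\,\delta A_\nu
  && \text{(integration by parts)}
\end{align*}
```

The conventional $\frac{1}{4}$ is thus exactly what is needed to leave a coefficient of unit magnitude in front of $\partial_\mu F^{\mu\nu}\,\delta A_\nu$.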


Return of the current 


But the second term in (6), indicating the presence of charged particles, is usually there. It
is written as an integral over the particle trajectories. In contrast, the third term is written as
an integral over spacetime. To compare the variation of the second term with the variation
of the third term, we have to write the second term also as an integral over $d^4x$, so that the
two terms have the same form. But we know how to do this. Indeed, (7) provides a strong
hint. Use the Dirac delta function introduced in chapter I.1!

Recall from chapter III.6 that the delta function is defined by $\int dy\,\delta(x - y)f(y) = f(x)$
for any reasonably smooth function $f(x)$. There we also generalized to the 4-dimensional
delta function, defined in an obvious way by $\delta^{(4)}(x - X_a) = \delta(x^0 - X_a^0)\,\delta(x^1 - X_a^1)\,\delta(x^2 - X_a^2)\,\delta(x^3 - X_a^3)$.
In other words, $\delta^{(4)}(x - X_a)$ vanishes unless $x^\mu$ and $X_a^\mu$ are equal
component by component.

With this quick review, we now write $A_\mu(X_a) = \int d^4x\,\delta^{(4)}(x - X_a)A_\mu(x)$ and thus the
second term in (6) as

$$\sum_a e_a\int d\tau_a\,A_\mu(X_a)\frac{dX_a^\mu}{d\tau_a} = \int d^4x\sum_a e_a\int d\tau_a\,\delta^{(4)}(x - X_a(\tau_a))\,A_\mu(x)\frac{dX_a^\mu}{d\tau_a} \qquad (10)$$

You might recognize that this form is analogous to the form of the second term in (7),
except that the physics is relativistic here and nonrelativistic there.
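Now vary $A_\mu(x)$ to obtain

$$\int d^4x\sum_a e_a\int d\tau_a\,\delta^{(4)}(x - X_a(\tau_a))\frac{dX_a^\mu}{d\tau_a}\,\delta A_\mu(x) \qquad (11)$$

Very nicely, we see the return of the 4-current defined in chapter III.6

$$J^\mu(x) = \sum_a e_a\int d\tau_a\,\delta^{(4)}(x - X_a(\tau_a))\frac{dX_a^\mu}{d\tau_a} \qquad (12)$$

We can then write (11) more compactly as $\int d^4x\,J^\mu(x)\,\delta A_\mu(x)$.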
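
To see what the current (12) amounts to in the simplest case, consider a single particle of charge $e$ sitting at rest at the origin. This is an illustrative special case, not one worked in the text; the worldline $X^\mu(\tau) = (\tau, 0, 0, 0)$ is the assumption.

```latex
\begin{align*}
X^\mu(\tau) &= (\tau, 0, 0, 0), \qquad \frac{dX^\mu}{d\tau} = (1, 0, 0, 0) \\
J^0(x) &= e\int d\tau\,\delta(t - \tau)\,\delta^{(3)}(\vec x) = e\,\delta^{(3)}(\vec x),
  \qquad J^i(x) = 0 \\
\int d^3x\,J^0(x) &= e
\end{align*}
```

The time component is a charge density concentrated at the particle, and it integrates to the total charge, as one would hope.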


Maxwell’s equations


Combining (8) and (11), we find
$$\partial_\mu F^{\mu\nu}(x) = -J^\nu(x) \qquad (13)$$

These are of course Maxwell’s equations governing the dynamics of the field, telling us how
the electromagnetic field is generated by the movement of charged particles in spacetime.
To write these equations in terms of the familiar electric $\vec E$ and magnetic $\vec B$ fields, we
simply use the identification given in (IV.1.17).
First, look at the time component ($\nu = 0$) of (13). We have $\partial_i F^{i0} = -\partial_i E^i = -\vec\nabla\cdot\vec E =
-J^0$. Calling $J^0$ the charge density $\rho$, we recover the familiar*

$$\vec\nabla\cdot\vec E = \rho \qquad (14)$$
Next, look at a spatial component by setting $\nu$ to 3, for example. Note that $\partial_\mu F^{\mu 3} =
\partial_0 F^{03} + \partial_1 F^{13} + \partial_2 F^{23} = \partial_0 E^3 - \partial_1 B^2 + \partial_2 B^1$. Thus, we obtain

$$\vec\nabla\times\vec B = \frac{\partial\vec E}{\partial t} + \vec J \qquad (15)$$

Seeing (14) and (15) emerge so naturally, you naturally wonder where “the other half of
Maxwell’s equations” are, namely the ones that do not involve the current. The answer is
that they are actually identities.⁵

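The point is that the electromagnetic field $F_{\mu\nu}$ is not any garden variety antisymmetric
tensor, but the relativistic curl of a 4-vector: $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. To exploit this bit of
information, let us define, in analogy with the 3-dimensional antisymmetric symbol $\varepsilon^{ijk}$,
the 4-dimensional antisymmetric symbol† $\varepsilon^{\mu\nu\lambda\sigma}$ by $\varepsilon^{0123} = 1$ and the rest determined by
antisymmetry. (For example, $\varepsilon^{1023} = -\varepsilon^{0123} = -1$.) Newton and Leibniz
tell us that derivatives commute (as in (5)). Therefore,
$$\varepsilon^{\mu\nu\lambda\sigma}\partial_\nu F_{\lambda\sigma} = \varepsilon^{\mu\nu\lambda\sigma}\partial_\nu(\partial_\lambda A_\sigma - \partial_\sigma A_\lambda) = 2\varepsilon^{\mu\nu\lambda\sigma}\partial_\nu\partial_\lambda A_\sigma = 0 \qquad (16)$$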


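If we write out (16) explicitly in terms of $\vec E$ and $\vec B$, we obtain the “missing Maxwell’s
equations.” Set $\mu$ to 3 to obtain $-\partial_0 F_{12} + \partial_0 F_{21} + 2(\partial_1 F_{02} - \partial_2 F_{01}) = 0$. Translating, we
find

$$\vec\nabla\times\vec E = -\frac{\partial\vec B}{\partial t} \qquad (17)$$

Similarly, setting $\mu$ to 0, we find⁶

$$\vec\nabla\cdot\vec B = 0 \qquad (18)$$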
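
As a concrete check of the source-free equations and of the identification (IV.1.17), the sketch below takes a plane wave potential and evaluates everything by finite differences in plain Python. The potential $A_1 = \cos(z - t)$ with all other components zero, the sample point, and the step size are hypothetical choices for illustration; the signature is $(-,+,+,+)$ with $c = 1$.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta^{mu mu}

def A(x):
    """Hypothetical plane wave moving along +z: only A_1 = cos(z - t) is nonzero."""
    t, _, _, z = x
    return [0.0, math.cos(z - t), 0.0, 0.0]

def partial(f, x, mu, h=1e-3):
    """Central-difference estimate of the derivative of f along x^mu."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (f(xp) - f(xm)) / (2 * h)

def F_lower(x):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu."""
    d = [[partial(lambda y, n=nu: A(y)[n], x, mu) for nu in range(4)]
         for mu in range(4)]
    return [[d[m][n] - d[n][m] for n in range(4)] for m in range(4)]

def F_upper(x):
    """Raise both indices with the diagonal Minkowski metric."""
    Fl = F_lower(x)
    return [[ETA[m] * ETA[n] * Fl[m][n] for n in range(4)] for m in range(4)]

def divergence(x):
    """d_mu F^{mu nu}, the left side of (13); it should vanish without charges."""
    return [sum(partial(lambda y, m=mu, n=nu: F_upper(y)[m][n], x, mu)
                for mu in range(4))
            for nu in range(4)]

x0 = [0.2, 0.5, -0.3, 1.0]
Fu = F_upper(x0)
E = [Fu[0][1], Fu[0][2], Fu[0][3]]     # E^i = F^{0i}
B = [Fu[2][3], Fu[3][1], Fu[1][2]]     # B^1 = F^{23}, B^2 = F^{31}, B^3 = F^{12}
expected = -math.sin(x0[3] - x0[0])    # analytic value of both E^1 and B^2
max_div = max(abs(c) for c in divergence(x0))
print(E, B, expected, max_div)
```

The only nonzero components are $E^1$ and $B^2$, equal to each other, so $\vec E \times \vec B$ points along $+z$, the direction in which the wave travels.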


* We could also run the argument the other way. After reading chapter III.6 on relativistic completion, you
could have contemplated the equation of electrostatics (14). Since $\vec\nabla$ and $\rho$ are both being promoted to 4-vectors,
$\vec E$ has to be part of a 4-tensor. We could then discover $F_{\mu\nu}$.


† Also known as the Levi-Civita symbol. Note that in $d$-dimensional spacetime, $\varepsilon^{\mu\nu\lambda\cdots\rho}$ carries $d$ indices with
$\varepsilon^{012\cdots(d-1)} = 1$.
One appealing feature of this approach is that the correct form of the electromagnetic
current (12) emerges naturally. Furthermore, applying $\partial_\nu$ to (13), we recover current
conservation

$$\partial_\nu J^\nu = 0 \qquad (19)$$

since $F^{\mu\nu} = -F^{\nu\mu}$ is antisymmetric.
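
The step from antisymmetry to current conservation can be spelled out in three lines. This is a sketch of the standard argument, using only index renaming, the antisymmetry of $F^{\mu\nu}$, and the fact that derivatives commute.

```latex
\begin{align*}
\partial_\nu\partial_\mu F^{\mu\nu}
  &= \partial_\mu\partial_\nu F^{\nu\mu} && \text{(rename } \mu \leftrightarrow \nu\text{)} \\
  &= -\partial_\mu\partial_\nu F^{\mu\nu} && \text{(antisymmetry of } F\text{)} \\
  &= -\partial_\nu\partial_\mu F^{\mu\nu} && \text{(derivatives commute)}
\end{align*}
```

A quantity equal to its own negative vanishes, so $\partial_\nu\partial_\mu F^{\mu\nu} = 0$, and (13) then gives $\partial_\nu J^\nu = 0$.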

The reader may be bothered, yet again, by the asymmetry between particles and field as 
manifested here in (6). The integration runs over the worldlines of the particles and over 
all of spacetime for the field. We need quantum field theory to tell us how to remove this 
asymmetry regarding how particles and field are treated. 

If you have taken a course on electromagnetism, you would recognize that this elegant 
treatment has captured the essence of the subject. 


Einstein’s legacy to physics 


When I was in high school, I came across a popular account of Einstein’s theories. Like 
the typical layperson, I was captivated by the outlandish and bizarre aspects of Einstein’s 
universe. Later, in college, after I had mastered enough physics and mathematics to 
understand Einstein’s work, I marveled at the mathematical subtleties involved, and I 
saw Einstein’s strange conclusions as perfectly logical consequences of his theory. But as 
I learned more physics and started doing research, I finally realized the true intellectual 
legacy⁷ Einstein bequeathed to later generations of physicists amounted to nothing less
than a new way of doing physics. 

To appreciate Einstein’s insight, let us review the schema followed in developing that 
quintessential 19th century theory, the theory of electromagnetism. By fooling around with 
frog’s legs and wires, physicists saw that Nature behaves in a certain pattern, summarized 
by Maxwell’s equations. The equations, once written down, sing out a song, waiting 
patiently for someone with ears to hear. Finally, a bright young fellow comes along and 
hears the equations saying that they are Lorentz invariant. This fellow then realizes that 
the symmetry demands a revision of all of physics. 

After Einstein worked out special relativity, it dawned on him and some of his contem- 
poraries, Minkowski in particular, that the logical arrows in this schema may be reversible. 
Suppose that it was secretly revealed to us, in the dark of night, that the world is Lorentz 
invariant. Knowing this, can we deduce Maxwell’s theory and hence, the facts of electro- 
magnetism, without ever stepping inside a laboratory? 

To a large extent, we can! The requirement of Lorentz invariance is a powerful constraint
on Nature. Maxwell’s equations are so intricately interrelated by this invariance that, given 
one of the equations, we can deduce the others. Start with, say, Coulomb’s law describing 
how the electric field produced by a charge decreases as one moves away from the charge. 

We are given a symmetry that relates space to time, the electric to the magnetic. So, not 
surprisingly, we also would know how a magnetic field would vary in space. 
Einstein taught us to deduce physics from symmetry, instead of symmetry from physics.
In Philip Roth’s The Ghostwriter, one of the characters, a famous writer, tells another 
character that he always writes one sentence before lunch. After lunch, he turns the 
sentence around, and he spends his life turning sentences around and around in his head. 
In much the same way, theoretical physicists turn logical structures around and around in 
their heads. Einstein and Minkowski realized that one can turn the logical arrows of the 
19th century around. 

Having grasped the power of symmetry, Einstein put it to use in developing his the- 
ory of gravity. Instead of laboriously distilling this theory from a motley collection of 
experimental facts and then extracting a symmetry, he formulated a symmetry empow- 
ering him to write down his theory of gravity in one fell swoop. To appreciate this, let 
us imagine what would happen if physicists followed the 19th century schema in study- 
ing gravity, as some physicists tried to do. After years of carefully studying planetary 
orbits, astronomers would have noticed absolutely minute deviations of the orbits from 
the Newtonian prediction. To account for this, physicists would add a tiny correction to 
Newton’s law of gravity. More careful study would reveal that this is still inadequate, and 
physicists then would be compelled to correct Newton’s law by an even tinier amount. 
In practice, this program would quickly grind to a halt. But even if we imagine that 
physicists were able to determine as many correction terms as they like, it would take 
a stroke of mathematical genius to see that the corrections would all combine to pro- 
duce a rather different theory. The theory in the intermediate stage would be a compli- 
cated mess. 

I regard Einstein’s understanding of how symmetry dictates design as one of the truly 
profound insights in the history of physics. Fundamental physics is now conducted largely 
according to Einstein’s schema rather than that of 19th century physics. Physicists in 
search of the fundamental design begin with a symmetry, then check to see whether its 
consequences accord with observation. But how is a physicist to get to square one in playing
Einstein’s game? the reader might ask. Presumably, no one is going to come in the dark 
of the night and whisper to us the symmetries Nature has woven into her tapestry. If 
an architect’s client wants to have symmetrical designs but won't tell the architect what 
symmetry he has in mind, how is the architect to find out? 

Well, physicists can extract the symmetry from known experimental facts. That is what 
Einstein did. The difficult part is to decide on the one most relevant fact that allows 
formulation of a symmetry. Out of the many facts known about gravity, Einstein fastened 
onto, as we will see, the fact that objects fall at the same rate, regardless of mass. He 
did not use, for example, the fact that the gravitational attraction between two objects 
varies inversely as the square of the distance between them. This and all other known 
facts emerge, as we will see in detail in Book 2, as consequences of the symmetry imposed 
on gravity. 

An interesting historical fact is that some of Einstein’s contemporaries, such as Lorentz, 
who had been struggling to produce a dynamical theory of the electron and of the ether, 
thought Einstein had cheated. Einstein simply imposed the principle of relativity and 
deduced the consequences. These other physicists felt that the principle of relativity should 
emerge from the dynamics. 
Perhaps some of the biggest puzzles of contemporary physics are waiting for a
principle: a principle to be imposed, not derived.


Exercises 


1 Show that $\mathcal{L} = \frac{1}{2}(\vec E^2 - \vec B^2)$.

2 (a) Show that the symmetric tensor defined by
$$T^{\mu\nu} = F^{\mu\lambda}F^\nu{}_\lambda - \frac{1}{4}\eta^{\mu\nu}F_{\lambda\rho}F^{\lambda\rho} \qquad (20)$$
satisfies the conservation law $\partial_\mu T^{\mu\nu} = 0$, in the absence of charged particles of course.

(b) Since charged particles and the electromagnetic field interact, that is, the particles and the field can
exchange energy and momentum, we would not expect $\partial_\mu T^{\mu\nu} = 0$ to hold in the presence of charged
particles. Add the energy momentum tensor $T^{\mu\nu}_{\text{particles}}$ of point particles defined in (III.6.7) to the
energy momentum tensor in (20), which we will now call $T^{\mu\nu}_{\text{electromagnetic}}$. Show that the resulting energy
momentum tensor $T^{\mu\nu} = T^{\mu\nu}_{\text{particles}} + T^{\mu\nu}_{\text{electromagnetic}}$ satisfies $\partial_\mu T^{\mu\nu} = 0$.
Later, in chapter VI.4, when we get to energy momentum in curved spacetime, this will become much
clearer with a more powerful formalism.

3 Show that, in the absence of charged particles, $T^{00} = \frac{1}{2}(\vec E^2 + \vec B^2)$, the standard expression for the energy
density of an electromagnetic field. This suggests that $T^{\mu\nu}$ is the energy momentum tensor of the
electromagnetic field. We will show that this is in fact the case in chapter VI.4.

4 Calculate $\partial_0 T^{00} = \frac{1}{2}\frac{\partial}{\partial t}(\vec E^2 + \vec B^2)$ using the standard Maxwell’s equations.

5 Evaluate the dual electromagnetic tensor $\tilde F_{\mu\nu} = \frac{1}{2}\varepsilon_{\mu\nu\lambda\sigma}F^{\lambda\sigma}$. Explain why it is called dual.

6 Derive the identity $\eta_{\lambda\sigma}(F^{\mu\lambda}F^{\nu\sigma} - \tilde F^{\mu\lambda}\tilde F^{\nu\sigma}) = \frac{1}{2}\eta^{\mu\nu}F_{\rho\tau}F^{\rho\tau}$.

7 Use the identity in the preceding exercise to show that the energy momentum tensor of the electromagnetic
field can be written in the symmetric form

$$T^{\mu\nu} = \frac{1}{2}\eta_{\lambda\sigma}(F^{\mu\lambda}F^{\nu\sigma} + \tilde F^{\mu\lambda}\tilde F^{\nu\sigma})$$

8 Show that we can construct only two scalars that are quadratic in the electromagnetic field. Identify them in
terms of $\vec E$ and $\vec B$.

9 Show that the $T^{\mu\nu}$ for the electromagnetic field as given in exercise 2 has zero trace.

10 If you remember what the virial theorem in classical mechanics⁸ is, you may be wondering about its
relativistic generalization. Consider a system consisting of charged particles interacting with the electromagnetic
field. Assume that the motion of the particles is confined to a finite region and that the electromagnetic field
vanishes at spatial infinity. Physically, this means that we do not allow electromagnetic radiation to escape to
infinity. Mathematically, we can then freely integrate by parts. Calculate the trace of the energy momentum
tensor and from this deduce the relativistic virial theorem. Hint: This is not an easy problem; you need to use some
results from chapter III.6.

Notes 

1. For example, S. Weinberg, The Quantum Theory of Fields; QFT Nut; and so forth. 
2. See QFT Nut, p. 19. 
3. We will do this for gravitational waves in Part IX. 
4. Normally in quantum mechanics, a spin 1 particle has 3 spin states. For a discussion of why a massless
spin 1 particle has only 2 spin states, see, for example, QFT Nut, chapter III.4. 

5. Whether they are equations or identities depends on whether you regard $A_\mu$ or $F_{\mu\nu}$ as fundamental. In the
quantum world, you are forced to treat $A_\mu$ as fundamental.

6. See QFT Nut, chapter IV.4. 

7. This section is adapted from pp. 95-100 of my popular book Fearful, written for the educated public. 

8. See, for example, H. Goldstein, Classical Mechanics. 


IV.3 Gravity Emerges! 


Forced to a tensor field 


Now that we have dealt with electromagnetism, we turn to gravity, or rather option G in 
chapter IV.1. Recall that you obtained 


Option G improved: $S = -m\int\sqrt{-g_{\mu\nu}(x)\,dx^\mu dx^\nu}$ (1)


with $g_{00} = -\left(1 + \frac{2V}{m}\right)$, $g_{0i} = g_{i0} = 0$, and $g_{ij} = \delta_{ij}$ as a special case.

Remarkably, if you put the Newtonian potential term $V\,dt$ outside the square root, you
are led to a vector field $A_\mu$, but if you put it inside, you are forced to a tensor field: two
indices are needed to match $dx^\mu dx^\nu$. In a sense, we have to thank Pythagoras for this
tensor field.


Time and gravity 


For the moment, let’s treat the special case

$$S = -m\int\sqrt{\left(1 + \frac{2V}{m}\right)dt^2 - d\vec x^{\,2}} \qquad (2)$$

The fraction $V/m$ looks a little strange, but then you suddenly realize that if the particle,
of mass $m$, is living in a gravitational potential $V = -GMm/r$, $m$ would cancel out, so that

$$S = -m\int\sqrt{\left(1 - \frac{2GM}{r}\right)dt^2 - d\vec x^{\,2}} \qquad (3)$$

That the mass $m$ cancels out depends of course on the profound observational fact that the
inertial mass and the gravitational mass are equal, as we have alluded to already several
times, in chapters I.1, II.3, and so forth.
Suppose this particle is actually a clock, sitting still in the potential (so that $d\vec x = 0$).
Then $S = -m\int\sqrt{\left(1 - \frac{2GM}{r}\right)dt^2} \simeq -m\int\left(1 - \frac{GM}{r}\right)dt$. But for a particle at rest, this is just
$-m\int d\tau$ with $\tau$ the proper time. Hence, $d\tau = \left(1 - \frac{GM}{r}\right)dt$ or $\Delta t = \Delta\tau/\left(1 - \frac{GM}{r}\right) > \Delta\tau$.
You have discovered that in a gravitational field, a clock runs slow.
Gravity affects the flow of time! An astounding statement. 
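
To get a feeling for the size of the effect, here is a rough numerical sketch for a clock at rest on the surface of the Earth, compared with one far away. It restores the factor of $c$ that the text sets to 1, so that the dimensionless quantity is $GM/(rc^2)$. The constants are approximate textbook values, and the Earth's rotation and the Sun's potential are ignored.

```python
# Approximate values in SI units (assumed inputs, not taken from the text).
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # mean radius of the Earth, m

def rate_deficit(mass, r):
    """GM/(r c^2): the fraction by which a clock at rest at radius r runs slow,
    to first order, relative to a clock far from the mass."""
    return G * mass / (r * C * C)

deficit = rate_deficit(M_EARTH, R_EARTH)
seconds_per_year = 365.25 * 24 * 3600
lag_per_year = deficit * seconds_per_year   # seconds lost per year of far-away time

print(f"GM/(r c^2) = {deficit:.3e}")
print(f"lag per year = {lag_per_year * 1e3:.1f} ms")
```

The deficit is a few parts in $10^{10}$, which is why the expansion of the square root to first order is an excellent approximation here.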


Universality of gravity 


In sounding a bad notation alert, in several instances, we used the trick of considering
several particles instead of a single particle to render the notational defect glaringly obvious.
For example, using this trick in the discussion leading up to (IV.1.21,23), we were led to the
notion of electric charge. So, consider a bunch of particles with different masses. Instead
of (2), we have

$$S = -\sum_a m_a\int\sqrt{\left(1 + \frac{2V(x_a)}{m_a}\right)dt^2 - d\vec x_a^{\,2}} \qquad (4)$$

A serious conceptual problem becomes apparent: unless the potential evaluated at $x_a$ is
proportional to $m_a$, particles with different masses would experience the passage of time
differently. Ultimately, we have to ask our experimental colleagues, of course, if they know
of such an effect. They don’t, and so we can say with some confidence that (4) does not
describe the physical world as we know it unless $V(x_a)$, the potential experienced by the
particle, is proportional to $m_a$. Remarkably, the gravitational potential has precisely this
property.


Curved spacetime came looking for us 


Of course, this assertion that the gravitational mass and inertial mass are equal has
to be tested by performing experiments to ever increasing accuracy. Our experimental
colleagues have assured us (and continue to assure¹ us) that, yes, this is indeed the case;
therefore, we can describe a bunch of relativistic particles in a gravitational field by the
action

$$S = -\sum_a m_a\int\sqrt{-g_{\mu\nu}(x_a)\,dx_a^\mu dx_a^\nu} \qquad (5)$$

with $g_{\mu\nu}(x_a)$ independent of the properties of the particle $a$, such as its mass.

Now comes your (actually, Einstein’s) profound insight. Aha, you say, this looks just like 
the length of different curves in curved spaces that we discussed in part I, except that here 
we have both space and time. So particles in a gravitational field move as if they were in 
curved spacetime! 

We did not go looking for curved spacetime. Curved spacetime came looking for us! 
This represents one of the quickest ways I know of introducing curved spacetime.
Curved spacetime follows from your desire to stick the Newtonian potential V inside the 
square root in (2). 

To summarize, the interpretation of gravity as the effect of curved spacetime is possible 
only because gravity is universal. Thanks to the equality of gravitational mass and inertial 
mass, the effect of a gravitational field on the motion of particles does not depend on the 
particle, whether it is an apple or a rock. 

This is not how gravity was discovered on the planet Terra. But as I mentioned earlier,
I can imagine a civilization (in a molecular cloud?) without a Newton, without apples and 
rocks, which somehow discovers that light travels at a universal speed. Along comes some 
bright young theorist who tries to stick the potential term inside the square root. He or 
she (or whatever) would then discover universal gravity. 


Gravitational redshift 


Let’s go back to the prediction that gravity slows down the flow of time. The universality of 
gravity means that all clocks, regardless of manufacturer, slow down by the same amount. 

The Smart Experimentalist* pipes up: “But how can you observe this effect if all physical 
processes at a given point slow down by the same factor?” 

Hmm, well yes, we are stumped. But she is just thinking out loud, and continues, “We 
compare the flow of time at different points! We could send a signal, say a photon, from 
here to there in a gravitational field. If clocks run at different rates at different places in a 
gravitational potential well, then the frequency of a photon climbing out of a gravitational 
well would be shifted toward the red.” 

Excellent suggestion! More on this prediction of gravitational redshift by Einstein later! 
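
To get a feel for the size of the effect, here is a minimal sketch in Python. It assumes the standard weak-field estimate that a photon climbing a height h in a uniform field g is shifted by the fraction gh/c², a result not derived in this chapter. The 22.5 m height and the names `fractional_redshift`, `G_SURFACE`, and `C_LIGHT` are illustrative choices, not taken from the text.

```python
# Rough size of the gravitational redshift near the earth's surface.
# Assumption: weak, uniform field, so delta_nu / nu is approximately g * h / c**2.

G_SURFACE = 9.81            # m/s^2, surface gravity (assumed value)
C_LIGHT = 299_792_458.0     # m/s, speed of light


def fractional_redshift(height_m: float, g: float = G_SURFACE) -> float:
    """Fractional frequency shift of a photon that climbs height_m."""
    return g * height_m / C_LIGHT**2


shift = fractional_redshift(22.5)   # an illustrative laboratory tower height
print(f"fractional shift over 22.5 m: {shift:.2e}")
```

The shift comes out at a few parts in 10^15, which is why comparing clocks at two heights calls for an extraordinarily sharp frequency standard.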


A recurring theme of modern physics 


Let’s summarize how you discovered electromagnetism in the preceding chapters. 
Generalizing the potential term $V(x)dt$ to $A_\mu(x)dx^\mu$, you uncovered a hidden gauge invariance, 
which then completely fixes the form of the electromagnetic field $F_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x)$ 
and subsequently the action governing its dynamics. The long trudge (not to 
mention the drudge) through the electromagnetic courses you took “merely” amounts 
to studying this dynamics for ever more involved situations involving wires, conducting 
plates, frog’s legs, and so forth. 

In discovering gravity, you are led by Lorentz invariance to the action (5). Being a 
graduate of part I of this book, you immediately recognize that this action enjoys an 
even richer “hidden” invariance: we can transform $x \to x(x')$ so that $g_{\mu\nu}(x)dx^\mu dx^\nu = 
g'_{\lambda\rho}(x')dx'^\lambda dx'^\rho$ and leave the action invariant. Much of general relativity is concerned 


* Like Confusio, also a character from QFT Nut. 
with this freedom to make coordinate transformations. Your next task, in analogy with the 
electromagnetic story, is to exploit this invariance to find the action governing the dynamics 
of the gravitational field, the analog of Maxwell’s action for the electromagnetic field. 

A recurrent theme of modern theoretical physics has been unification and the resulting 
discovery of ever deeper invariances in the laws of physics. 


Note 


1. Read about the work of E. Adelberger and the Eöt-Wash group at the University of Washington: 
http://www.npl.washington.edu/eotwash/. See also the discussion and the endnotes in chapter I.1. 


Recap to Part IV 


Living in a galaxy far far away, you admire the beauty of the relativistic free particle action 
and contemplate how to deprive the particle of its freedom. 

As far as you can see, there are only two options: put the potential either outside, or inside, 
the square root. Two, and (apparently) only two, options. 

With the first option, you discover how charged particles hear the electromagnetic field, 
and even better, you also discover the gauge principle. Understanding the gauge principle, 
you can understand how the electromagnetic field responds to the movement of charged 
particles. 

Intoxicated by your success, you go on to discover “half of gravity” by sneaking the 
potential into the square root. You then come to the astonishing insight that gravity slows 
down the flow of time. You sure are a smart guy, no doubt about it. 

Hindsight is oh so easy. 


BOOK TWO 
From the Happiest Thought to the Universe 


Prologue to Book Two 


The Happiest Thought 


The happiest thought of his life 


I was sitting in a chair in the patent office in Bern when all of 
a sudden a thought occurred to me: “If a person falls freely he 
will not feel his own weight.” I was startled. This simple thought 
made a deep impression on me. It impelled me toward a theory 
of gravitation. 

—A. Einstein 


One November day in 1907, Einstein had what he later called the happiest thought¹ of 
his life. In 1905, he had his annus mirabilis, producing five papers² that shook physics 
to its foundation, including not only the papers founding the theory of special relativity 
but also the paper on the photoelectric effect that helped establish quantum physics and 
introduced the concept of the photon. You would have thought that Einstein would have 
been made a full professor on the spot if the physics community had any sense at all. Well, 
he was indeed promoted, a year later, to technical expert second class in the patent office. 


An April Fools’ prank 


To understand what the daydreaming second class expert was happily thinking about, 
let’s play an elaborate April Fools’ Day prank on one of our friends. While the guy is 
asleep, put him in a spacious box elaborately furnished inside to look exactly like his living 
room. We then drop the box from a high-flying airplane (see figure 1). When our friend 
Figure 1 A living room falling. 


wakes up, he thinks that he is in his living room. Curiously, though, he feels that he is 
floating.³ To an observer on the ground, our friend and his living room are hurtling toward 
a crunching rendezvous with the ground. Our friend, however, is blissfully unaware of the 
impending disaster. Since he is accelerating downward at the same rate as the box and all 
the objects contained inside, he feels that he is not moving downward at all relative to his 
surroundings. A slight spring in his step and he finds himself drifting toward the ceiling. 
He feels that he is floating. But this action is interpreted by the ground observer quite 
differently: our friend, by stepping on the floor, has at the same time decreased slightly 
his downward velocity and increased slightly the box’s downward velocity. He thinks he 
is floating upward but in reality his downward plunge is accelerating at the same rate as 
before. 

Indeed, this awfully unethical April Fools’ joke has already been tried: we put astronauts 
inside a box called a spaceship and drop it out of the sky. To be humane, we give the box 
a forward motion so that as soon as the box drops, the ground would have the good sense 
of curving away by just the right amount, so the box stays up at the same altitude. When 
you see on TV an astronaut floating in space, with the announcer commenting in the 
background that the astronaut is in the zero-g environment of space 100 miles above our 
heads, you of course know better, since you are capable of reading this book. The astronaut 
is in a 0.95-g environment, subject to only 5% less gravity than we are. He is floating 
because he is falling, and because he is falling, he does not feel that he is falling, just as 
the young technical expert second class thought. 
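
The 0.95-g figure is easy to check. The sketch below assumes Newton's inverse-square law outside a spherical earth of mean radius 6371 km; the function name `gravity_fraction` and the constants are illustrative, not from the text.

```python
# Fraction of surface gravity remaining at a given altitude,
# assuming an inverse-square law outside a spherical earth.

EARTH_RADIUS_KM = 6371.0    # mean radius of the earth (assumed value)
KM_PER_MILE = 1.609344


def gravity_fraction(altitude_km: float) -> float:
    """Return g(altitude) / g(surface) = (R / (R + h))**2."""
    return (EARTH_RADIUS_KM / (EARTH_RADIUS_KM + altitude_km)) ** 2


fraction = gravity_fraction(100 * KM_PER_MILE)
print(f"gravity 100 miles up is {fraction:.3f} of its surface value")
```

At 100 miles the fraction is about 0.95, so the astronaut floats because he is falling, not because gravity has gone away.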
A birthday toy 


On Einstein’s 76th and last birthday in 1955, his neighbor Eric Rogers presented him with 
a toy⁴ constructed for the occasion. (In figure 2a, I show the engineering drawing for the 
toy, and in figure 2b, a photo of the toy constructed for me by Louis Grace.) Basically, a 
spring tries to pull a ball hanging limply outside into a cup but is too weak to do so. The 
challenge is to get the ball into the cup. The historian of science I. Bernard Cohen visited 
Einstein not long after, and he wrote: 


At last I was taking my leave. Suddenly Einstein turned and called “Wait. Wait. I must show 
you my birthday present.” Back in the study I saw Einstein take from the corner of the room 
what looked like a curtain rod five feet tall, at the top of which was a plastic sphere about 
four inches in diameter. “You see,” said Einstein, “this is designed as a model to illustrate the 
equivalence principle. . . . ” A big grin spread across his face and his eyes twinkled with delight 
as he said, “And now the equivalence principle.” Grasping the gadget in the middle of the long 
brass curtain rod, he thrust it upwards until the sphere touched the ceiling. “Now I will let it 
drop,” he said, “and according to the equivalence principle there will be no gravitational force. 
So the spring will now be strong enough to bring the little ball into the plastic tube.” With that 
he suddenly let the gadget fall freely and vertically, guiding it with his hand, until the bottom 
reached the floor. The plastic sphere at the top was now at eye level. Sure enough, the ball rested 
in the tube. 


Figure 2. (a) An engineering drawing of the old man’s toy. (b) The toy constructed for the author, 
shown in its two states. 
Much more about the equivalence principle later, but for now, note that the ball is just 
as easily fooled as an astronaut. When Einstein let his toy fall, the ball, precisely because it 
was falling, did not feel any gravity; the ball was the stand-in for the falling person in the 
patent clerk’s daydream.⁶ The spring, normally too weak to pull the ball up against gravity, 
now seized the chance to yank the ball into the cup. 


The falling candle 


Einstein loved to pop playful little puzzles on his visitors. He was equally delighted whether 
or not they knew the answers. If they didn’t, he would get a big kick out of explaining it. 
Let’s see if you can figure this one out. Suppose you have just lighted a candle in an elevator 
when, unfortunately, the cable breaks. The elevator falls freely. What happens to the candle 
flame? Try to answer the grinning old man looking at you with a twinkle in his eyes. 


“In proportion to its quantity” 


After dinner, the weather being warm, we went into the garden and drank thea, under the 
shade of some apple trees,* only he and myself. Amidst other discourse, he told me he was 
just in the same situation, as when formerly, the notion of gravitation came into his mind. 
It was occasion’d by the fall of an apple, as he sat in a contemplative mood. Why should that 
apple always descend perpendicularly to the ground, thought he to himself. . . . Assuredly, the 
reason is, that the earth draws it. There must be a drawing power in the matter: and the sum of 
the drawing power in the matter of the earth must be in the earths center, not in any side of the 
earth. Therefore dos this apple fall perpendicularly, or towards the center of the earth. If matter 
thus draws matter, it must be in proportion of its quantity. Therefore the apple draws the earth, 
as well as the earth draws the apple. That there is a power, like that we here call gravity, which 
extends its self thro’ the universe. [W. Stukeley,⁷ in his memoir of Sir Isaac Newton] 


To understand gravity in more detail, let us consider our April Fools’ prank again. For 
the prank to work, it is crucial that all objects fall at exactly the same rate. Suppose to 
the contrary that the box falls faster than our friend. Then our friend would find himself 
pinned to the ceiling, which he would interpret as being due to the presence of a force 
pushing him up. Conversely, if the box were to fall slower, our friend would feel a force 
pulling him to the floor. The extreme case in which the box is not falling at all is of course 
the normal situation, with the box resting on the house foundation. 

That objects all fall at the same rate regardless of their composition is contrary to 
everyday intuition, but as Galileo suspected, our everyday experiences are distorted by 
air resistance. As you know, and as I explained in chapter I.1, inertial mass is equal to 


* Supposedly, a descendant of Newton’s apple tree⁸ now stands outside Trinity College in Cambridge, England. 
gravitational mass, and so the motion of an object in a gravitational field does not depend 
on its mass: in a vacuum, a feather and a cannonball fall at the same rate.⁹ 

Physics students generally identify Einstein as the person who brought fame to various 
gedanken experiments. But in fact thought experiments go way back to Galileo, at least. 
The following is taken straight from his “Discorsi e dimostrazioni matematiche” (1638): 


Salviati: If then we take two bodies whose natural speeds are different, it is clear that on 
uniting the two, the more rapid one will be partly retarded by the slower, and the slower will 
be somewhat hastened by the swifter. Do you not agree with me in this opinion? 

Simplicio: You are unquestionably right. 

Salviati: But if this is true, and if a large stone moves with a speed of, say, eight while a smaller 
moves with a speed of four, then when they are united, the system will move with a speed less 
than eight; but the two stones when tied together make a stone larger than that which before 
moved with a speed of eight. Hence the heavier body moves with less speed than the lighter; 
an effect which is contrary to your supposition. Thus you see how, from your assumption that 
the heavier body moves more rapidly than the lighter one, I infer that the heavier body moves 
more slowly.¹⁰ 


Universality of gravity and a ball of whiskey 


Let’s try to imagine the patent clerk’s train of thought. A falling person does not know he is 
falling, because everything around him is falling at the same rate, in other words, because 
of the universality of gravity. Can I turn this around? Gravity must be universal because a 
falling person does not know he is falling. In a way, falling cancels out gravity. Hmmm, 
suppose I somehow reverse falling by thrusting upward. Can I then produce gravity? Aha! 

To understand what Einstein had in mind, let us inflict an even more elaborate April 
Fools’ joke on our friend. This time, while he is asleep, we put him inside a box and fly 
him deep into intergalactic space, far away from any gravitational field of force. Now rev 
up the engine and accelerate the whole contraption at a constant rate. When he wakes up, 
he notices nothing unusual at all. No floating sensation this time. He drops an apple, and 
it promptly falls to the floor (figure 3). But to an outside observer, floating in space and 
watching the spaceship go zinging by, the dropped apple is actually floating in space in 
happy ignorance of the fact that the floor is rushing at it with ever-increasing speed. If we 
accelerate the rocket at precisely the right rate, our friend would see the apple falling to 
the floor exactly as if he were back on earth. Keep in mind the key phrase “as if” for future 
reference. 

By accelerating the rocket—in effect, reversing free fall—we can produce gravity. Clearly, 
if our friend drops a stone as well as the apple from the same height above the floor, 
the stone and the apple will “fall” and hit the floor at exactly the same instant. 

But what is to him a mysterious universality is laughably obvious to the observer floating 
about outside: the floor is moving up to meet the floating apple and stone and so obviously 
arrives at the two objects at the same time. 
Figure 3 The floor rushing up to meet the apple. 


In one of the Tintin stories, Captain Haddock has smuggled on board a spaceship a bottle 
of whiskey hidden inside a hollowed-out book on cosmology.¹¹ Just as he was about to set 
lips to glass, a bumbling character named Thomson accidentally turns off the spaceship’s 
engine. The spaceship stops accelerating. The whiskey, suddenly feeling no gravity, has 
no further compunction to stay inside the glass: it exploits surface tension to curl itself 
up into a ball and floats out of the glass. Tintin then manages to turn the engine back on. 
The spaceship accelerates. Gravity comes back on, and Captain Haddock and the ball of 
whiskey crash to the floor. 

Confusio: “I get it! So, an apple and a stone dropped from the Leaning Tower of Pisa did 
not fall, but were suspended motionless in space. It was actually the ground which rushed 
up to meet them! That would explain why the apple and the stone hit the ground at exactly 
the same instant. The relativity of motion!” 

Indeed, a mite on the ground looked up at the enormous apple coming down and saw his 
entire life flash by him in an instant, but a mite on the apple, equally terrorized, watched 
the ground rushing up to crush her. 


“As if” is good enough 


But yet, this sounds like total nonsense. The earth carrying the Leaning Tower and the 
entire town of Pisa rushing up toward the apple and the stone? How could you explain 
gravity with that peculiar hallucination? Besides, all around the world, people are dropping 
things, ripe fruits are falling down from trees, and nerdy physicists are tripping all over 
themselves. The earth would have to be rushing this way and that. 
Nevertheless, the notion of the ground rushing up to meet the apple and the stone is 
such an amazingly simple explanation of why the apple and the stone hit the ground at 
exactly the same instant—there must be some element of truth to it. 

To make sense out of nonsense, the key insight is that “as if” is good enough. The 
ground does not literally have to rush up. It is enough to say that gravity behaves as if the 
ground were rushing up. We can formulate this more academically by calling the “as if” 
an equivalence. 


Einstein’s equivalence principle 


Dear reader, together we have arrived at Einstein’s equivalence principle. This profound 
and fundamental principle states that, in a small enough region of spacetime, no experi- 
ment can tell us whether we are in a gravitational field or in an accelerating frame. 

Note the caveat of a “small enough region of spacetime,” where by small enough we 
mean a region small compared to the characteristic size of the gravitational field. This is 
easy to comprehend. Suppose our friend dropped the stone and the apple on earth rather 
than in a box accelerating in empty space. We eternal students of physics, not to mention 
Newton and his friend Stukeley, know that the stone and the apple do not fall down, but 
that they fall toward the center of the earth (recall chapter I.2). Indeed, as the stone and 
the apple fall, they approach each other slightly, the effect being suppressed by the ratio 
of the separation between the stone and the apple to the radius of the earth. By careful 
measurement of this so-called tidal effect (recall chapter I.4), we can in fact determine 
whether we are in the earth’s gravitational field or in an accelerating box. 
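
How careful would the measurement have to be? Here is a rough sketch, assuming a spherical earth, two objects released side by side with a small horizontal separation d, and the small-angle estimate that their relative acceleration is g times d/R. The 50 m drop, the 1 m separation, and the function names are illustrative assumptions.

```python
# Tidal approach of two objects dropped side by side near the earth's surface.
# Assumption: each falls toward the earth's center, so for a small horizontal
# separation d the relative acceleration is about g * d / R.
import math

G_SURFACE = 9.81           # m/s^2 (assumed value)
EARTH_RADIUS_M = 6.371e6   # m (assumed value)


def relative_acceleration(separation_m: float) -> float:
    """Rate at which the two objects accelerate toward each other."""
    return G_SURFACE * separation_m / EARTH_RADIUS_M


def approach_during_fall(separation_m: float, drop_height_m: float) -> float:
    """Distance by which the separation shrinks while falling drop_height_m."""
    fall_time = math.sqrt(2.0 * drop_height_m / G_SURFACE)
    return 0.5 * relative_acceleration(separation_m) * fall_time**2


approach = approach_during_fall(separation_m=1.0, drop_height_m=50.0)
print(f"separation shrinks by about {approach:.2e} m")
```

The answer reduces to d times h/R, a few micrometers here, which shows how the effect is suppressed by the ratio of laboratory lengths to the radius of the earth.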

The equivalence principle is a statement about physics in a small region of spacetime. 


Exercise 


1 What happens to the flame of a falling candle? 


Notes 


1. This prologue is based in part on A. Zee, An Old Man’s Toy. 

2. Einstein, Einstein’s Miraculous Year. 

3. You can now experience this for yourself if you are willing to pay. Einstein’s happy thought is being exploited 
commercially. See www.gozerog.com. 

4. A. Zee, An Old Man’s Toy. See also A. Zee, in E = Einstein, ed. D. Goldsmith and M. Bartusiak, Sterling, 2007, 
p. 223. 

5. I. Bernard Cohen, “An Interview with Einstein,” Scientific American, July 1955, p. 73. 

6. For a brief discussion of acrophobia, elevator phobia, and daydreams, see Toy/Universe, pp. 17, 257. 

7. See Toy/Universe. 

8. For a sketch of Newton’s life and his encounter with that famous apple, see Toy/Universe, pp. xv-xvii. 

9. Now you can see this amazing fact on the web. 


10. S. Drake, Galileo Galilei, Two New Sciences. Copyright © 1974 by the Regents of the University of Wisconsin 
System. Reprinted by permission of The University of Wisconsin Press. 


11. By Lemaitre? Could have been a book on gravity, I’m not sure. See Toy/Universe, p. 15. 


Part V Equivalence Principle and Curved Spacetime 


Spacetime Becomes Curved 


A mysterious force emanating from the Bering Strait 


Imagine flying from Los Angeles to Taipei. Flipping idly through the back of an in-flight 
magazine (or more likely the flight map on the video by the time I finish this book), you 
might notice that the plane follows a curved path arcing toward the Bering Strait. Is the 
Bering Strait exerting a mysterious attractive force on the plane? 

On your next trip you try another airline. This pilot follows exactly the same curved path. 
Don’t these pilots have any sense of personality or originality? Why don’t they sometimes, 
just for the heck of it, swing south and fly over Hawaii, say? They seem to prefer to fly 
over¹ grim and unsuspecting Inuit hunters rather than cheerful Polynesian maidens. 

Not only is the mysterious force attractive, it is universal, independent of the make of the 
airplane. Should you seek enlightenment from the guy sitting next to you? Dear reader, 
surely you are chuckling. You know perfectly well that the Mercator projection distorts 
the earth, and pilots follow scrupulously the shortest possible path between Los Angeles 
and Taipei. The answer to the universality of the mystery force is to be sought, not in the 
physics, but in the economics, department. 
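
The "attraction" toward the north can be checked with a little spherical geometry. The sketch below treats the earth as a sphere of radius 6371 km and uses approximate city coordinates; the coordinates and function names are assumptions made for illustration.

```python
# Shortest path on a sphere between Los Angeles and Taipei.
# Assumptions: spherical earth, approximate (latitude, longitude) in degrees.
import math

EARTH_RADIUS_KM = 6371.0
LOS_ANGELES = (34.05, -118.25)
TAIPEI = (25.03, 121.56)


def to_unit_vector(lat_deg: float, lon_deg: float):
    """Convert latitude and longitude to a unit vector from the earth's center."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def great_circle_distance_km(p, q) -> float:
    """Length of the shortest path on the sphere between p and q."""
    u, v = to_unit_vector(*p), to_unit_vector(*q)
    dot = sum(a * b for a, b in zip(u, v))
    return EARTH_RADIUS_KM * math.acos(max(-1.0, min(1.0, dot)))


def midpoint_latitude_deg(p, q) -> float:
    """Latitude of the point halfway along the great circle from p to q."""
    u, v = to_unit_vector(*p), to_unit_vector(*q)
    s = [a + b for a, b in zip(u, v)]
    norm = math.sqrt(sum(c * c for c in s))
    return math.degrees(math.asin(s[2] / norm))


distance_km = great_circle_distance_km(LOS_ANGELES, TAIPEI)
mid_lat = midpoint_latitude_deg(LOS_ANGELES, TAIPEI)
print(f"distance: {distance_km:.0f} km, midpoint latitude: {mid_lat:.1f} deg N")
```

The midpoint of the route lies well north of both cities, even though no force pulls the plane there. On this estimate it also stays well south of the Bering Strait itself.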

But is it so laughably obvious? Consider the leading theoretical physicists before Einstein 
came along. They knew the well-verified experimental fact that all things fall at the same 
rate, be it an apple or a stone. Perhaps the fact that an apple and a stone would fall in 
exactly the same way in a gravitational field is no more amazing than the fact that different airlines, 
regardless of national or political affiliation, would choose exactly the same path getting 
from Los Angeles to Taipei. An apple or a stone traverses the same path in spacetime, just 
as a commercial flight follows the same path on the curved earth regardless of the airline.² 
In hindsight, we might see an “obvious” connection, but hindsight³ is of course way too 
easy. For three hundred years, the universality of gravity has been whispering “curved 
spacetime” to us. 

As I said in chapter IV.3, we did not go looking for curved spacetime, curved spacetime 
came looking for us! 
No gravity, merely the curvature of spacetime 


Just as there is no mysterious force emanating from the Bering Strait, one could say that 
there is no gravity, merely the curvature of spacetime. The gravity we observe is due to the 
curvature of spacetime. More accurately, gravity is equivalent to the curvature of spacetime, 
or gravity and the curvature of spacetime are really the same thing. 

To summarize and to underline the point, Einstein says that spacetime is curved and that 
objects take the path of least distance in getting from one point to another in spacetime. 
Environment dictates motion. The curvature of spacetime tells the apple and the stone to 
follow the same path from the top of the tower to the ground. The curvature of the earth 
tells the pilots to follow the same path from Los Angeles to Taipei. 

This amazing revelation about the role of spacetime offers an elegantly simple explana- 
tion of the universality of gravity. Gravity curves spacetime. That’s it. Spacetime is curved 
and gravity’s job is done. It’s now up to every particle in the universe to follow the best path 
in this curved environment. This explains why gravity acts indiscriminately on every parti- 
cle in exactly the same way. Next time you take a nasty fall, whether on the ski slope or in the 
bathtub, just think, every particle in your body is merely trying to get the best deal for itself. 


Acceleration 


In the prologue to book two, you read about Einstein’s happy thought that being in an 
accelerated frame is equivalent to being in a gravitational field. The parable here suggests 
that gravity is a manifestation of curved spacetime. Let us now substantiate these analogies. 

Warm up with the simplest example of a freely moving Newtonian particle in one spatial 
dimension obeying $d^2y/dt^2 = 0$. (To avoid writing primes in the subsequent discussion, we 
call the spatial coordinate y instead of x’.) Let us now transform to an accelerated frame. 
Instead of the linear Galilean transformation, we now have $y = x - \frac{1}{2}at^2$. Differentiating 
twice, we obtain 

$$\frac{d^2y}{dt^2} = \frac{d^2x}{dt^2} - a \qquad (1)$$

Thus, the observer in the accelerated frame insists that there is a force, given by the mass 
m of the particle times 

$$\frac{d^2x}{dt^2} = a \qquad (2)$$
Note that the force is proportional to the inertial mass. Simple yet profound! 
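
A quick numerical check of this warm-up, as a sketch: take a free particle in the y frame, rewrite its position using x = y + at²/2, and estimate the second derivative by finite differences. The initial data, the value of a, and the helper names are illustrative assumptions.

```python
# Free particle seen from an accelerated frame (Newtonian warm-up).
# The frames are related by y = x - a * t**2 / 2, so x = y + a * t**2 / 2.

def y_inertial(t: float, y0: float = 0.0, v0: float = 2.0) -> float:
    """Position of a free particle in the inertial frame: no acceleration."""
    return y0 + v0 * t


def x_accelerated(t: float, a: float) -> float:
    """Position of the same particle as recorded in the accelerated frame."""
    return y_inertial(t) + 0.5 * a * t * t


def second_derivative(f, t: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


a = 3.0
acc_y = second_derivative(y_inertial, 1.0)
acc_x = second_derivative(lambda t: x_accelerated(t, a), 1.0)
print(f"y frame: {acc_y:.6f}   x frame: {acc_x:.6f}")
```

The y observer finds no acceleration, while the x observer finds the acceleration a for every particle, whatever its mass.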

This is all familiar stuff, experienced often in daily life. Riding in a speeding car, we are 
thrown forward when the driver suddenly slams on the brake. Beginning physics students 
learn this as the “effect of inertia.” A wet dog shaking itself dry knows how to exploit this 


effect. 
Repeat this little discussion for a relativistic particle. Suppose that an observer, living 
in Minkowskian spacetime with $d\tau^2 = -\eta_{\rho\sigma}dy^\rho dy^\sigma$, sees no force acting on a particle, 
that is, 

$$\frac{d^2y^\rho}{d\tau^2} = 0 \qquad (3)$$


What does the other observer see? 

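Instead of the simple relation between y, x, and t in our warm-up Newtonian example, 
we now let the coordinates $y^\rho$ be related to the other observer’s coordinates $x^\mu$ by a 
general coordinate transformation specified by 4 functions $y^\rho(x)$. Now it’s just a matter of 
arithmetic, albeit highbrow arithmetic, to work out what (3) implies for $d^2x^\mu/d\tau^2$. Just plug in, 
then chug away. Differentiating $y^\rho$ once, we get 

$$\frac{dy^\rho}{d\tau} = \frac{\partial y^\rho}{\partial x^\mu}\frac{dx^\mu}{d\tau} \qquad (4)$$

Differentiating $y^\rho$ twice, we get 

$$\frac{d^2y^\rho}{d\tau^2} = \frac{\partial y^\rho}{\partial x^\mu}\frac{d^2x^\mu}{d\tau^2} + \frac{\partial^2 y^\rho}{\partial x^\mu \partial x^\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} \qquad (5)$$

Thus, if one observer sees a freely moving particle $d^2y^\rho/d\tau^2 = 0$, the other sees 

$$\frac{d^2x^\lambda}{d\tau^2} + \frac{\partial x^\lambda}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial x^\mu \partial x^\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0 \qquad (6)$$

We have multiplied by $\partial x^\lambda/\partial y^\rho$ and used $\frac{\partial x^\lambda}{\partial y^\rho}\frac{\partial y^\rho}{\partial x^\mu} = \frac{\partial x^\lambda}{\partial x^\mu} = \delta^\lambda_\mu$. The “x observer” sees a force 
acting on the particle: $d^2x^\lambda/d\tau^2$ does not vanish. Compare (2) and (6). The latter is just a more 
complicated version of the former; the physics involved is essentially the same. 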


Curved space and curved spacetime 


“But wait a minute!” you exclaim. “It all looks familiar. Didn’t we see this somewhere 
already?” 

Yes indeed, way back in chapter II.2, Professor Flat explained that, in a locally flat region, 
the geodesic equation for a curved space reduces to the equation for a straight line, as 
anybody would expect. There we didn’t include time, and the geodesic is parametrized 
by the length defined by $ds^2 = \delta_{\rho\sigma}dy^\rho dy^\sigma$; here the path followed by the particle is 
parametrized by the proper time defined by $d\tau^2 = -\eta_{\rho\sigma}dy^\rho dy^\sigma$. The role of Euclid’s $\delta$ is 
played by Minkowski’s $\eta$. 

The metric for curved spacetime can hardly wait to pop out. Given that the “y observer” 
sees the Minkowski metric, what metric does the “x observer” see? The invariance of the 
proper time interval gives instantly 


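$$d\tau^2 = -\eta_{\rho\sigma}dy^\rho dy^\sigma = -\eta_{\rho\sigma}\frac{\partial y^\rho}{\partial x^\mu}\frac{\partial y^\sigma}{\partial x^\nu}dx^\mu dx^\nu \qquad (7)$$

Thus, the “x observer” sees the metric 

$$g_{\mu\nu} = \eta_{\rho\sigma}\frac{\partial y^\rho}{\partial x^\mu}\frac{\partial y^\sigma}{\partial x^\nu} \qquad (8)$$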
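
As a concrete instance of (8), here is a minimal sketch in 1+1 dimensions with c = 1. It reuses the warm-up change of coordinates, $y^0 = t$ and $y^1 = x - \frac{1}{2}at^2$, now treated as a general coordinate transformation, and builds the metric numerically from finite-difference derivatives. The sample point, the value a = 0.5, and all function names are illustrative assumptions.

```python
# Metric seen by the "x observer": g_mu_nu = eta_rho_sigma (dy^rho/dx^mu)(dy^sigma/dx^nu).
# Assumptions: 1+1 dimensions, units with c = 1, signature (-, +).

ETA = [[-1.0, 0.0],
       [0.0, 1.0]]


def y_of_x(x, a: float = 0.5):
    """Inertial coordinates (y^0, y^1) in terms of accelerated ones (t, pos)."""
    t, pos = x
    return [t, pos - 0.5 * a * t * t]


def jacobian(func, x, h: float = 1e-5):
    """J[rho][mu] = dy^rho/dx^mu, estimated by central differences."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        fp, fm = func(xp), func(xm)
        for rho in range(n):
            J[rho][mu] = (fp[rho] - fm[rho]) / (2.0 * h)
    return J


def induced_metric(func, x):
    """Contract the Minkowski metric with two copies of the Jacobian."""
    J = jacobian(func, x)
    n = len(x)
    return [[sum(ETA[r][s] * J[r][mu] * J[s][nu]
                 for r in range(n) for s in range(n))
             for nu in range(n)]
            for mu in range(n)]


g = induced_metric(y_of_x, [1.0, 2.0])   # evaluated at t = 1, pos = 2
print(g)
```

For this transformation the components work out to $g_{tt} = -1 + a^2t^2$, $g_{tx} = -at$, and $g_{xx} = 1$, so the metric is no longer the constant Minkowski one even though the underlying spacetime is flat.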


Define the Christoffel symbol as 


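$$\Gamma^\lambda_{\mu\nu} = \frac{\partial x^\lambda}{\partial y^\rho}\frac{\partial^2 y^\rho}{\partial x^\mu \partial x^\nu} \qquad (9)$$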


and you literally see (6) morph into the geodesic equation (II.2.19) found by minimizing 
the length of a curve (II.2.13) before your very eyes. 

I cordially invite you to go back to chapter II.2 and compare our discussion of curved 
spacetime here with the discussion of curved space there. Everything looks the same. For 
example, the connection between $\Gamma^\lambda_{\mu\nu}$ and the metric goes through as before. Observe that 
we are on the right track: $\Gamma^\lambda_{\mu\nu}$ depends on the second derivative of y with respect to x. 
Thus, if the relationship between y and x is linear, as given by a Lorentz transformation 
or a simple rotation, then indeed $\Gamma^\lambda_{\mu\nu} = 0$, and there is no gravitational force. Comparing 
(8) and (9), you see that $\Gamma^\lambda_{\mu\nu}$ can be constructed out of the metric and its first derivatives 
$\partial_\lambda g_{\mu\nu}$, just as in chapter II.2. 
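
The claim that a linear transformation gives a vanishing Christoffel symbol, while an accelerated one does not, can be tested numerically. The following sketch evaluates definition (9) by finite differences in 1+1 dimensions with c = 1. The two sample transformations (`accelerated` and `boost`), their parameters, and the helper names are assumptions chosen for illustration.

```python
# Christoffel symbol from definition (9):
#   Gamma^lam_{mu nu} = (dx^lam/dy^rho) * (d^2 y^rho / dx^mu dx^nu)
# Assumptions: 1+1 dimensions, c = 1, derivatives by central differences.
import math


def accelerated(x, a: float = 0.5):
    """y^0 = t, y^1 = pos - a t^2 / 2 (the warm-up transformation)."""
    t, pos = x
    return [t, pos - 0.5 * a * t * t]


def boost(x, v: float = 0.6):
    """A Lorentz boost: a linear map between x and y."""
    t, pos = x
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [gamma * (t - v * pos), gamma * (pos - v * t)]


def first_derivs(func, x, h: float = 1e-4):
    """J[rho][mu] = dy^rho/dx^mu."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        fp, fm = func(xp), func(xm)
        for rho in range(n):
            J[rho][mu] = (fp[rho] - fm[rho]) / (2.0 * h)
    return J


def second_derivs(func, x, h: float = 1e-4):
    """H[rho][mu][nu] = d^2 y^rho / dx^mu dx^nu."""
    n = len(x)
    H = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for mu in range(n):
        for nu in range(n):
            def shifted(s_mu, s_nu):
                p = list(x)
                p[mu] += s_mu * h
                p[nu] += s_nu * h
                return func(p)
            pp, pm = shifted(1, 1), shifted(1, -1)
            mp, mm = shifted(-1, 1), shifted(-1, -1)
            for rho in range(n):
                H[rho][mu][nu] = (pp[rho] - pm[rho] - mp[rho] + mm[rho]) / (4.0 * h * h)
    return H


def inverse_2x2(m):
    """Inverse of a 2x2 matrix; here it gives dx^lam/dy^rho."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def christoffel(func, x):
    """Gamma[lam][mu][nu] assembled as in equation (9)."""
    J_inv = inverse_2x2(first_derivs(func, x))
    H = second_derivs(func, x)
    n = len(x)
    return [[[sum(J_inv[lam][rho] * H[rho][mu][nu] for rho in range(n))
              for nu in range(n)]
             for mu in range(n)]
            for lam in range(n)]


gamma_acc = christoffel(accelerated, [1.0, 2.0])
gamma_boost = christoffel(boost, [1.0, 2.0])
print("accelerated frame, Gamma^x_tt:", round(gamma_acc[1][0][0], 6))
print("boosted frame, largest entry:",
      max(abs(c) for plane in gamma_boost for row in plane for c in row))
```

For the accelerated frame the only nonzero component is $\Gamma^x_{tt} = -a$, so (6) reads $d^2x/d\tau^2 = a\,(dt/d\tau)^2$, which reduces to (2) for a slowly moving particle. For the boost every component vanishes up to rounding error.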

The astute reader would recognize that the discussion of the geodesic equation in 
chapter II.2 is just the mathematician’s version of the physicist’s equivalence principle 
that we can always go to an inertial frame in which there is no gravity. Translation: locally 
flat coordinates = inertial frame, and Christoffel symbol = force attributed to gravity. 

We will explore motion in curved spacetime in detail in the next two chapters. 

Since the laws of arithmetic are reversible, we can reverse the logic. Here we start with 
a particle happily cruising in flat spacetime (3), free from the demand of any force. We 
then make an arbitrary coordinate transformation, writing y(x) for y. Apply the chain rule 
of elementary calculus, and we discover the geodesic equation (6) and curved spacetime 
(8). Now reverse the logic: start with a particle in a gravitational field, go to a locally flat 
region of spacetime (sometimes called a locally inertial frame) in which $\Gamma^\lambda_{\mu\nu}$ vanishes, 
and watch (6) simplify dramatically to the motion of a freely moving particle (3). This 
is of course Einstein’s happy thought again: a freely falling person does not feel any 
gravity. 


Fictitious forces 


Some older books call the force in (2) an “inertial force,” but wait, particle theorists assure 
us that there are only the strong force, the weak force, the electromagnetic force, and the 
gravitational force. So, what is an inertial force? 

In high school, I was also terribly confused⁴ by the centrifugal force. The book said 
the force was “fictitious,” but yet I remembered that as a little kid riding the Merry-Go- 
Round, I was told in no uncertain terms that I must hold on tight. I certainly felt all these 
fictitious forces. Even more puzzling: the book went on to mention the centripetal force. 
Why couldn’t the book make up its mind? Is it fugal* or is it petal†? 

You know the resolution of all this confusing talk: the book was just moving bits and 
pieces of $\vec{a}$ back and forth between the right hand side and the left hand side of $m\vec{a} = \vec{F}$. 
What you call force I could call a piece of the acceleration, and what you call centripetal 
I could call centrifugal. Einstein’s insight was that the most commonly experienced force 
of all, the gravitational force, may be an example of a “fictitious force.” 


Exercise 


1 A helium balloon is attached to a child’s seat in the back of a car. When the speeding car suddenly brakes to 
a stop, how does the balloon move? 


Notes 


1. I am abusing geography slightly in the same way I occasionally abuse notation. 

2. Referring back to (IV.3.4), we see that if the gravitational mass is not equal to the inertial mass, this would 
correspond to, in our analogy, different airplanes seeing a different curvature of the earth. 

3. Staircase wit, l’esprit d’escalier, Treppenwitz; firing the cannon after the cavalry had already charged by you. 

4. I can’t blame the teacher since I did not get to take a physics course. 

5. The next line is “Sic transit gloria mundi.” 

6. The next line is “Gather ye rosebuds while ye may.” 

7. The root “petere,” for “to desire” or “to seek,” appears in words like “appetite” and “petition.” 


* As in⁵ “Tempus fugit.” 
† As in⁶ “All flowers wilt,” just kidding.⁷ 
The equivalence principle predicts 


We learned in the prologue to book two (and in the preceding chapter) that, in a small 
enough region of spacetime, no experiment can tell us whether we are in a gravitational 
field or in an accelerating frame. 

Einstein’s theory of gravity is built on this equivalence principle. As I suggested heuristi- 
cally, through thought experiments and parables, the equivalence principle leads us directly 
to an understanding of the gravitational field as a manifestation of curved spacetime.¹ At 
this point, many textbook authors start to wring their hands, fretting about the long road 
ahead, and warning their readers about the considerable mathematical machinery involved 
in mastering Riemannian geometry. Of course, all this is true; to say the contrary would 
be like saying that you could master Newtonian mechanics without learning calculus. But 
by arranging the material so that you started by learning to hop from coordinate trans- 
formations to curved surfaces, I hope that by now you have already absorbed enough of 
the relevant mathematics so that the rest of the road we have to travel will not look so 
formidable. 

Before we start developing Einstein gravity, I want to mention that two of its most 
striking predictions, namely the deflection of light and the gravitational redshift, follow 
directly from the equivalence principle. 

Recall from the prologue to book two that there are two distinct April Fools’ thought 
experiments we can contemplate doing. In one, we put our friend in a box in empty 
space, far from any gravitational field, and accelerate the box. In the other, we drop the 
box in some gravitational field, such as that of the earth. It would be instructive to derive 
both predictions using the two different thought experiments, which I will refer to as 
“Accelerated” and “Dropped,” respectively, sort of how a funding agency would file away 
these two kinds of experiments. 
Bending of light 


Thought experiment “Accelerated” 


Our friend drops an apple, and it promptly falls to the floor. The apple does what it has 
always done. But to the observer floating outside, the floor is rushing, “rushing up” as 
we are almost tempted to say, with ever increasing haste, to meet the apple. Then our 
friend, quite a quirky person, fires a laser gun at a wall. He notices that the red spot is 
located below the mark he aimed at. No question about it, light falls in a gravitational 
field! The laser beam bends in a graceful parabola, like any material object thrown at that 
wall. 

But the outside observer would describe what happened quite differently. He sees the 
laser beam moving in a straight line, since there is no gravitational field around. But the 
wall had moved “upward” in the time it took the beam to cross the room. See figure 1a. 
Amazingly simple and beautiful argument! The equivalence principle settles, once and for 
all, the question whether light falls. 

It is worth remarking that, while this argument, often given in popular physics books,² 
establishes that light falls, it does not determine the actual amount precisely. The reason is 
that it does not take the intrinsically relativistic nature of light into account; the argument 
would apply even if the laser beam consists of a stream of tiny particles obeying Newtonian 
mechanics and moving at speed c, as in Newton’s corpuscle theory of light. In chapter VI.3 
we will do the calculation in Einstein’s theory and show that the amount of bending is twice 
the Newtonian value. 



Figure 1 Firing a laser gun at a wall in (a) an accelerating box deep in space and (b) a box dropped from a great 
height above the earth. 
Thought experiment “Dropped” 


Now consider the other thought experiment of dropping the box with our friend in it from 
a great height above the earth (figure 1b). A dropped apple floats in space, and our friend 
is blissfully unaware of any gravitational field. He fires the laser gun, and sure enough, it 
hits the exact spot he aimed for. Why shouldn't it? There is no gravitational field around. 
A freely falling person does not know gravity! 

But the outside observer sees the entire box falling. The spot marked on the wall for 
target practice has dropped in the time it took the laser beam to get there. For the laser 
beam to hit the spot, it must have fallen exactly as much as the wall. To her, standing on 
earth, there is a gravitational field and light falls. 


Very nicely, both thought experiments reach the same conclusion. 


Gravitational redshift 


Thought experiment “Accelerated” 


Back in the box accelerating in deep space, our friend now fires his laser gun at the ceiling. 
(More than being quirky, he might have some personality disorder.) First, what does our 
all-seeing outside observer see? By the time the light gets up to the ceiling, since the box is 
accelerating, the detector attached to the ceiling is moving faster than when the shot was 
fired. Plain old Doppler effect tells us that the detector will see light of a lower frequency. 
Our friend, since he is convinced he is in some gravitational field, concludes that light 
redshifts as it “climbs” from a point of lower gravitational potential to a point of higher 
gravitational potential. In other words, the outside observer sees a Doppler redshift, while 
our friend the inside observer sees a gravitational redshift (figure 2a). 

Not only is the equivalence principle argument direct and convincing, it allows us to 
calculate the effect using literally freshman physics! Let the height of the ceiling from the 
floor be $h$. So light took time $\Delta t = h/c$ to get from the floor to the ceiling, by which time 
the ceiling is moving with speed $v = g\Delta t = gh/c$ in a frame in which it was at rest at time 
$t - \Delta t$. For $gh/c \ll c$, we don’t even need the fancy relativistic Doppler result derived in 
chapter I.3, merely the elementary Doppler shift result $\Delta\omega/\omega = -v/c = -(gh/c)/c$. The 
situation $gh/c \ll c$ corresponds to a weak gravitational potential $\Phi = gh$ (recall that $\Phi$ has 
been defined as the potential energy per unit mass ever since part I). We thus conclude 
that $\Delta\omega/\omega = -(gh/c)/c = -(\Phi_{\rm ceiling} - \Phi_{\rm floor})/c^2$. 

The equivalence principle tells us that this holds in any weak gravitational field: 


$$\frac{\omega_{\rm receiver} - \omega_{\rm emitter}}{\omega_{\rm emitter}} = -\frac{\Phi_{\rm receiver} - \Phi_{\rm emitter}}{c^2} \tag{1}$$


To check the sign in (1), remember that $\Phi_{\rm higher} > \Phi_{\rm lower}$, where $\Phi_{\rm higher}$ is the potential at 
the tree branch where the apple was hanging and $\Phi_{\rm lower}$ is the potential at Newton’s head. 
Note that the terms “higher” and “lower” are in accordance with everyday usage. Thus, if 
the emitter is on the floor and the receiver is on the ceiling, the right hand side in (1) is 
negative, and the frequency received is lowered toward the red. 


Figure 2 Firing a laser gun at the ceiling in (a) an accelerating box deep in space and (b) a box dropped from a 
great height above the earth. 


Thought experiment “Dropped” 


What about the other thought experiment, in which the box with our friend inside is 
dropped from a great height? (See figure 2b.) He invokes the daydreaming clerk’s happy 
thought, that a falling person does not feel gravity. Indeed, he is happily floating, in the 
idealized freely moving inertial frame that elementary physics is described in. So, of course, 
the light detector in the ceiling registers the same frequency: why would the light change 
its frequency propagating in an inertial frame in the absence of the gravitational field? 
In fact, being an accomplished experimentalist, he rigs up the detector to flash a signal 
indicating “Yes, same frequency!” 

The observer standing on earth sees the signal and is puzzled. She observes the detector 
rushing down toward the light and so should see a Doppler blue shift and register a higher 
frequency. Being an insightful theorist, she eventually suspects that there must be another 
effect that cancels the Doppler blue shift: the earth’s gravitational field must shift the light’s 
frequency toward the red by precisely the same amount as the Doppler shift. 


Instructively, in these thought experiments, although the two observers disagree on what 
is going on, they come to the same conclusion: when a photon “climbs up” a gravitational 
potential, its frequency redshifts. 
Gravity affects the flow of time 


Actually, we already had a hint of gravitational redshift back in chapter IV.3 when the bright 
young physicist tried to incorporate the physical notion of a potential into the relativistic 
action for a point particle. In option G, the action is modified to 


$$S = -m\int\sqrt{(1 + 2\Phi(\vec x))\,dt^2 - d\vec x^2}$$


Let me remind you that, in particular, for a particle at a fixed position, its proper time 
interval is given by $d\tau = \sqrt{1 + 2\Phi(\vec x)}\,dt \simeq (1 + \Phi(\vec x))\,dt$. Gravity affects the flow of time. 
The change in this flow of time translates into a change in frequency: frequency effectively 
depends on where you are according to $\omega(\vec x) \propto 1/(1 + \Phi(\vec x)) \simeq 1 - \Phi(\vec x)$. In other 
words, $\omega_{\rm receiver}/\omega_{\rm emitter} \simeq 1 - \Phi_{\rm receiver} + \Phi_{\rm emitter}$. Restoring a factor of $c^2$ by dimensional 
analysis, we recover precisely (1). 
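
As a quick numerical sanity check on the weak-field expansion just used, the sketch below compares the exact frequency ratio implied by $d\tau = \sqrt{1 + 2\Phi}\,dt$ with the first-order expression $1 - \Phi_{\rm receiver} + \Phi_{\rm emitter}$. The function names and the sample potentials are illustrative assumptions, not values from the text; units are chosen so that $c = 1$.

```python
import math

def exact_ratio(phi_emitter, phi_receiver):
    """omega_receiver / omega_emitter implied by d(tau) = sqrt(1 + 2*Phi) dt, with c = 1."""
    return math.sqrt((1.0 + 2.0 * phi_emitter) / (1.0 + 2.0 * phi_receiver))

def first_order_ratio(phi_emitter, phi_receiver):
    """Weak-field approximation 1 - Phi_receiver + Phi_emitter."""
    return 1.0 - phi_receiver + phi_emitter

# Hypothetical dimensionless potentials (Phi / c^2): emitter on the floor,
# receiver on the ceiling, so Phi_receiver > Phi_emitter.
phi_floor, phi_ceiling = -2.0e-6, -1.0e-6

exact = exact_ratio(phi_floor, phi_ceiling)
approx = first_order_ratio(phi_floor, phi_ceiling)
print(f"exact = {exact:.15f}, first order = {approx:.15f}")
print(f"difference = {abs(exact - approx):.3e}")
```

Both ratios come out below 1 (a redshift for light climbing upward), and they differ only at second order in the potential.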

Incidentally, this also resolves an apparent puzzle. When you first heard about the 
gravitational redshift, you might have wondered how counting the number of waves that 
pass by per unit time could be affected by gravity. The answer is that gravity affects the 
running of the clock used to define “unit time.” 


The Smart Experimentalist 


As we will discuss in chapter VI.3, the deflection of light was observed soon after Einstein 
proposed his theory of gravity in 1915. In contrast, almost 50 years had to pass before 
gravitational redshift was verified. While the two effects are conceptually almost equally 
easy to understand, one effect challenges the experimentalist much more severely. 

For the sun, $\Phi(\text{surface of sun})/c^2 = GM/(Rc^2) \sim 10^{-6}$. How do you disentangle this 
tiny frequency shift of one part in a million from the standard Doppler shift due to the 
thermal motion of the emitting atom on the solar surface? With $GM/(Rc^2) \sim 10^{-9}$ for 
the earth, terrestrial experiments seemed even more out of reach until the discovery in 
1958 of the Mössbauer effect. Normally, emission lines from an atom in a crystal are 
broadened by recoil and by the interaction of that particular atom with its neighbors, all 
in thermal agitation. Mössbauer discovered (while a graduate student) that under certain 
circumstances, the atoms are all locked together so that the recoil is transferred to the 
crystal as a whole and the emission lines are much sharpened. In a famous experiment 
performed in a tower* of the Harvard physics building, Pound and Rebka in 1960 exploited 
the Mössbauer effect to verify the gravitational redshift. 

Here comes our friend the Smart Experimentalist. “The tower is only about $h = 20$ 
meters high, and so the effect is only $(GM/(Rc^2))(h/R) \sim 10^{-15}$! How would one of you 
theorists do the experiment?” Think for a moment, particularly if you are a theorist. The 
answer is in the appendix. 


* The tower was built as part of a deal to recruit Edwin Hall (1855-1938) from Johns Hopkins University, 
where he had discovered the effect bearing his name. Hall believed that a similar effect might exist for gravity, 
hence the tower. 
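
To make the orders of magnitude quoted above concrete, here is a minimal sketch that evaluates $GM/(Rc^2)$ for the sun and the earth and the tower estimate $(GM/(Rc^2))(h/R)$. The masses, radii, and constants are standard rounded values supplied here as assumptions; they do not come from the text.

```python
G = 6.674e-11   # Newton's constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8     # speed of light, m/s (rounded)

def surface_potential_over_c2(mass_kg, radius_m):
    """Magnitude of the dimensionless surface potential GM / (R c^2)."""
    return G * mass_kg / (radius_m * C**2)

# Rounded reference values (assumptions for illustration).
SUN_MASS, SUN_RADIUS = 1.989e30, 6.957e8
EARTH_MASS, EARTH_RADIUS = 5.972e24, 6.371e6
TOWER_HEIGHT = 20.0  # meters, the figure quoted by the Smart Experimentalist

sun_shift = surface_potential_over_c2(SUN_MASS, SUN_RADIUS)
earth_shift = surface_potential_over_c2(EARTH_MASS, EARTH_RADIUS)
tower_shift = earth_shift * TOWER_HEIGHT / EARTH_RADIUS  # equals g*h/c^2

print(f"sun:   GM/(R c^2) ~ {sun_shift:.1e}")
print(f"earth: GM/(R c^2) ~ {earth_shift:.1e}")
print(f"tower: fractional shift ~ {tower_shift:.1e}")
```

The tower number is the same thing as $gh/c^2$ with $g = GM/R^2$, which is why the weak-field formula (1) applies directly.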


The power of the equivalence principle 


Let us appreciate the far-reaching power of the equivalence principle. Suppose we have 
mastered the physics of a certain class of phenomena in the absence of gravity. In other 
words, we understand the physics in flat Minkowskian spacetime. It doesn’t matter what 
kind of physics; it could be the physics of quarks interacting with gluons, for example. 
Thanks to the equivalence principle, we know immediately the corresponding physics 
in the presence of gravity. All we have to do is to go to an accelerating frame. But this 
amounts to a change of coordinates, and we know how to do that in general. As we saw in 
the preceding chapter, to write down the physics in the presence of gravity, all we have to 
do is replace flat spacetime with curved spacetime. 

In the simplest case of a point particle, we merely have to replace the Minkowski metric 
$\eta_{\mu\nu}$ in the action $S = -m\int\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu}$ by the general metric $g_{\mu\nu}(x)$. There, we have 
it without doing any work! The action for a particle moving in a gravitational field is 


$$S = -m\int\sqrt{-g_{\mu\nu}(x)\,dx^\mu dx^\nu} \tag{2}$$


Well, well, it is precisely option G in chapter IV.3. 
Later in chapters V.4 and V.6, you will see more examples of the equivalence principle 
in action. 


A matter of words: Gravitational field “versus” curved spacetime 


We now understand that the gravitational field is a manifestation of curved spacetime, or 
perhaps more accurately, that the gravitational field and curved spacetime are effectively 
the same thing. In the parable given in chapter V.1, the Bering Strait does not exert a 
mysterious force on a plane flying from Los Angeles to Taipei. Rather, the plane is following 
the curvature of the earth. We could say that there is no such thing as gravity, only curved 
spacetime. 

But you could say with equal justification that spacetime does not exist; there is only 
the gravitational field. To me, it is just a matter of words, and the only relevant issue is 
which language you find more useful to think in. Some authors³ like to make dramatic 
pronouncements, something along the following: space has disappeared, time has dis- 
appeared, spacetime has disappeared! Yes, indeed, with a gravitational field and hence 
the geodesics of test particles, you can determine where and when without “referring” to 
an underlying spacetime. Einstein himself, in his more philosophical moments, adopted 
this point of view, writing in 1916 that “the requirement of general covariance takes away 
from space and time the last remnant of physical objectivity.” In other words, there is 
no spacetime, only a bunch of fields interacting with one another,* with the gravitational field 
first among equals. As a quantum field theorist, this picture appeals to me also, with the 
gravitational field providing an arena for the other fields to play in. 

However, I think that most physicists, myself included, find it more natural to think 
of particles and fields moving in a curved spacetime seeking the best action deal for 
themselves. But as I said, it is merely a matter of words, and in the end, it is the equations 
that matter. 


A misconception 


I conclude by mentioning a common misconception. Some textbooks state that Einstein 
gravity is based on the principle of general covariance. If we interpret this principle as 
stating that we are free to choose whatever coordinate system we like, then this statement 
by itself is empty of content, or misleading at best. We've always had that freedom, even 
in Newtonian physics; perhaps we have forgotten, because we usually choose coordinates 
that make the equations look the simplest. 

The correct statement is that Einstein gravity is based on the equivalence principle, 
which relates two physical situations, one with a gravitational field and one without. This 
is quite different from a symmetry, such as rotation or special relativity, which tells us that, 
under rotation or Lorentz transformation, respectively, physical laws are left invariant. I 
cannot emphasize this point enough. 

More precisely, we note that for each of the effects described in this chapter, the fund- 
ing agency was generous enough to support two different experiments, one filed under 
“Accelerated,” the other called “Dropped.” For each of these experiments, two principal 
investigators are involved: an “inside” observer and an “outside” observer. Consider again 
the gravitational redshift. In the Accelerated experiment, we could rig up the detector on 
the ceiling to email the frequency reading to the two observers. No doubt about it, both 
observers write down that the frequency has shifted toward the red. 

Confusio: “I see that the confusion some students have may have stemmed from an 
inherent sloppiness in the English language. You wrote that the outside observer sees a 
Doppler redshift, while the inside observer sees a gravitational redshift. What do you mean 
by the word ‘sees’?” 

Confusio is absolutely right. At the cost of being more wordy, we should have said that 
the outside observer thinks that he sees a Doppler redshift, while the inside observer 
thinks that he sees a gravitational redshift, even though they agree on the actual amount 
of frequency shift. 

The equivalence principle is not a statement that “physics” does not depend on the 
observer (as in the corresponding statements for rotation invariance or special relativity). 
It does! One observer sees a gravitational field, the other does not. But now the Talmudic 
quibble could be over what we mean by the word “physics.” The two observers see the same 
frequency shift: they receive the same email from the detector. But they offer different 
theoretical interpretations for the same redshift, as due to acceleration for the outside 
observer, and to gravity for the inside observer. 


* Quantum field theory teaches us that particles are ultimately manifestations of fields. 

Similarly, in the Dropped experiment, both observers are told by the detector that there 
isn’t any frequency shift. But while the inside observer thinks that he is freely moving, 
the outside observer publishes an explanation that the gravitational redshift has canceled 
a Doppler blue shift. 


Appendix: The “meaning” of gravity? 


Okay, did you figure out how those smart experimentalists did it way back when?* The clever idea is to move the source 
(or the receiver) up and down at precisely calibrated speeds, thus exploiting the Doppler redshift to alternately 
add to or subtract from the gravitational redshift. 

The gravitational redshift was later verified to a much higher degree of accuracy by launching a rocket carrying 
a maser to a height of $10^4$ km, comparable to the radius of the earth. Experimental physicists like to say that 
yesterday’s discovery is today’s calibration and tomorrow’s background. Nowadays, the gravitational redshift 
appears as a necessary correction term in the global positioning system (GPS) that many use routinely without 
giving it a second thought.† General relativity has entered into everyday life. 
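
To get a feel for the size of the GPS correction, the sketch below applies the weak-field redshift formula (1) to a satellite clock and subtracts the lowest-order special relativistic slowing due to orbital speed. Everything numerical here is an assumption made for illustration: a circular orbit of radius about 26,560 km, rounded values of $GM$ and the earth's radius, and a ground clock whose own rotational speed is neglected.

```python
C = 2.998e8            # speed of light, m/s (rounded)
GM_EARTH = 3.986e14    # gravitational parameter of the earth, m^3/s^2 (rounded)
R_EARTH = 6.371e6      # mean radius of the earth, m (rounded)
R_ORBIT = 2.656e7      # assumed radius of a circular GPS orbit, m
SECONDS_PER_DAY = 86400.0

def gravitational_rate_offset(r_low, r_high):
    """Fractional rate by which a clock at r_high runs fast relative to one at r_low,
    using the weak-field result (Phi_high - Phi_low) / c^2."""
    return GM_EARTH * (1.0 / r_low - 1.0 / r_high) / C**2

def velocity_rate_offset(speed):
    """Fractional rate by which a moving clock runs slow, to lowest order in v/c."""
    return 0.5 * (speed / C) ** 2

orbital_speed = (GM_EARTH / R_ORBIT) ** 0.5  # circular-orbit speed
gravity_us_per_day = gravitational_rate_offset(R_EARTH, R_ORBIT) * SECONDS_PER_DAY * 1e6
velocity_us_per_day = velocity_rate_offset(orbital_speed) * SECONDS_PER_DAY * 1e6
net_us_per_day = gravity_us_per_day - velocity_us_per_day

print(f"gravitational speed-up: {gravity_us_per_day:.1f} microseconds per day")
print(f"velocity slow-down:     {velocity_us_per_day:.1f} microseconds per day")
print(f"net offset:             {net_us_per_day:.1f} microseconds per day")
```

Under these assumptions the gravitational term dominates, and the satellite clock gains a few tens of microseconds per day relative to the ground.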

Incidentally, the 19th century tower was not equipped with an elevator. Pound and his assistants had to carry 
the heavy equipment up the narrow tower in mountaineering backpacks. Years later, I heard Pound joke that by 
performing this experiment, he had truly learned the meaning of gravity. 


Exercise 


1 A high-precision atomic clock is carried on a plane flying eastward around the world. Calculate the fractional 
time shift between the time elapsed as measured by this clock versus a clock kept on the ground (and versus 
a clock on a plane flying westward around the world). (The experiment has been carried out.) Hint: Both 
special and general relativity come in. Do the calculation for the idealized case (of course): plane flying along 
equator at a constant altitude, earth’s spin axis perpendicular to the equatorial plane, and so on. 


Notes 


1. My wife and I bought our son Max a few balls when he was about 1 year old. I expected that he would learn 
about Newtonian, or at least Aristotelian, mechanics, but no, he graduated right off to Einsteinian mechanics. 
The newly installed wood floor in our house, while level and flat to the eye, is in fact not: the balls follow 
noticeably curved paths. See also Toy/Universe, figure 2.4 on p. 26. 

2. For example, Toy/Universe. 

3. For example, C. Rovelli, in Quantum Gravity. 

4. C. W. Chou et al., Science 329 (2010), p. 1630. 


* The technology of clocks has improved so much in 50 years that now one merely has to raise one clock 
by 33 centimeters relative to another to detect⁴ a difference. Incidentally, with the same technology, time 
dilation in special relativity has been measured down to speeds of about 35 kilometers per hour. 

† And let’s not forget that the laser and the sensor used in the system are based, respectively, on the 
concept of stimulated emission set forth in Einstein’s 1917 paper on radiation and on the photoelectric 
effect explained in his 1905 Nobel Prize-winning paper. 


V.3 The Universe as a Curved Spacetime 


From curved space to curved spacetime 


I set up this book so that it is now a cinch for you to jump from curved space to curved 
spacetime. Given that you have played around with curved spaces, understood Minkowski’s 
geometry of spacetime, and heard what Professor Flat said about local flatness, you are 
more than ready to play with curved spacetimes. 

You understood Professor Flat’s explanation that in a small enough region around any 
given point, it is always possible to go to locally flat coordinates, a definition of Riemannian 
manifold if you like. Easy: a simple matter of diagonalizing a matrix and counting the 
degrees of freedom we have in changing coordinates to cancel off the linear deviations 
from flatness. 

Well, we can jump from curved spaces to curved spacetimes immediately, if by flat 
we now mean Minkowskian flat, rather than Euclidean flat. The entire discussion in 
chapter I.5 can be taken over; you merely have to mentally replace $\delta_{\mu\nu}$ by $\eta_{\mu\nu}$. 

A $d = (D+1)$-dimensional curved spacetime, with $D$ spatial dimensions and one 
temporal dimension, is defined by 


$$ds^2 = g_{\mu\nu}(x)\,dx^\mu dx^\nu \tag{1}$$


with $\mu, \nu = 0, 1, \cdots, D$ taking on $d = D + 1$ values, such that at any given point $x_*$, we 
can transform the coordinates so that $g_{\mu\nu}(x_*)$ becomes $\eta_{\mu\nu}$ (with, as usual, $\eta_{00} = -1$, and 
$\eta_{ii} = +1$, $i = 1, \cdots, D$, and $\eta_{\mu\nu} = 0$ otherwise). We normally take $D = 3$ for the spacetime 
we are living in. 

My pedagogical strategy is to let you play around with a couple of examples of curved 
spacetime instead of giving you a long formal exposition. 
The simple spacetime we quite likely live in 


Our first example is the spacetime described by 
$$ds^2 = -dt^2 + a(t)^2 d\vec x^2 = -dt^2 + a(t)^2(dr^2 + r^2 d\Omega^2) \tag{2}$$


where $d\Omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ represents the usual spherical coordinates. At any given time 
$t$, the spatial geometry $dS^2 = a(t)^2 d\vec x^2$ is just the familiar flat Euclidean space $E^3$ we deal 
with every day, homogeneous (that is, every point is just as good as every other point) and 
isotropic (every direction is as good as any other direction). 

The proper distance between two points $(t, \vec x)$ and $(t, \vec x + d\vec x)$ at the same time but 
separated infinitesimally by $d\vec x$ is then given by $a(t)|d\vec x|$. Thus, with $a(t)$ (known as the 
scale factor of the universe) some function of t, this spacetime could describe an expanding 
or contracting universe. Eventually, we will learn about the dynamics driving the time 
evolution of a(t), but for the moment, we are just going to describe and explore this 
particular spacetime with a given a(t). Not only is (2), which I will refer to as the expanding 
universe* for short, the simplest curved spacetime I know of, but it is also quite likely the 
spacetime we live in. 

For phenomena with a characteristic time scale small compared to the time scale on 
which a(t) varies, the spacetime is effectively locally Minkowskian. Almost trivially, at any 
point in spacetime, we simply define $y^i$ by $dy^i = a(t_*)dx^i$, and we have $g_{\mu\nu}(y_*) = \eta_{\mu\nu}$ after 
changing coordinates from $(t, \vec x)$ to $(t, \vec y)$. 

A historical note. Metrics akin to (2), now almost universally and erroneously attributed 
to Willem de Sitter (1872-1934), were first written down in 1925 by Georges Lemaître 
(1894-1966). In 1917, de Sitter wrote down a metric that was not homogeneous in space, 
an error corrected later by Kornel Lanczos (1893-1974) and Hermann Weyl (1885-1955) 
independently. It has been said¹ that “Lanczos had the key to an expanding universe in 
his hands, but he did not unlock the door.” Let that be a lesson to the reader! Lemaître 
arrived at the metric² (2) independently, but in contrast to Lanczos and Weyl, understood 
the physics of the expanding universe. (Hence the quote about Lanczos.) 


* A common misconception³ is that cosmic expansion causes all distances to scale up, which is manifestly 
false. For example, locally, the earth and the moon are bound to each other and the distance between them is not 
expanding. 


Motion in curved spacetime 


Life is easy. Since you already know how to determine geodesics in curved space, you 
know how to determine geodesics in curved spacetime and hence the paths followed 
by free particles. Indeed, I even used the notation $g_{\mu\nu}$ in chapter II.2. Hence, as was 
already discussed in the preceding chapter, you could lift the geodesic equation 
$\frac{d^2x^\rho}{d\tau^2} + \Gamma^\rho_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0$ in its entirety from chapter II.2 and use it in curved spacetime. As was 
explained in chapter II.2, it is actually simpler to extremize the action 


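$$S = -m\int d\tau\left[\left(\frac{dt}{d\tau}\right)^2 - a(t)^2\left(\frac{d\vec x}{d\tau}\right)^2\right]^{\frac12} \tag{3}$$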


directly. We obtain 


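$$\frac{d^2t}{d\tau^2} + a(t)\dot a(t)\left(\frac{d\vec x}{d\tau}\right)^2 = 0 \tag{4}$$

and 

$$\frac{d}{d\tau}\left(a(t)^2\frac{d\vec x}{d\tau}\right) = 0 \;\Rightarrow\; \frac{d^2\vec x}{d\tau^2} + \frac{2\dot a(t)}{a(t)}\frac{dt}{d\tau}\frac{d\vec x}{d\tau} = 0, \tag{5}$$


plus of course 


$$\left(\frac{dt}{d\tau}\right)^2 - a(t)^2\left(\frac{d\vec x}{d\tau}\right)^2 = 1 \tag{6}$$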

Except for a sign here and there, everything is the same as in chapter II.2. 
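
The three equations are not independent, and a small numerical experiment shows this. The sketch below integrates (4) and (5) for motion along one spatial direction with a fourth-order Runge-Kutta step and then checks that $a(t)^2\,dx/d\tau$ stays constant and that the normalization (6) is preserved. The choice $a(t) = e^{t}$, the initial data, and all function names are assumptions made purely for illustration.

```python
import math

def scale_factor(t):
    return math.exp(t)   # assumed scale factor a(t) = e^t

def scale_factor_dot(t):
    return math.exp(t)   # its time derivative

def rhs(state):
    """Right-hand side of the geodesic equations (4) and (5) for one spatial dimension.
    state = (t, x, dt/dtau, dx/dtau)."""
    t, x, ut, ux = state
    a, adot = scale_factor(t), scale_factor_dot(t)
    return (ut, ux, -a * adot * ux**2, -2.0 * (adot / a) * ut * ux)

def rk4_step(state, h):
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (p + 2.0 * q + 2.0 * r + w) / 6.0
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))

def integrate(state, tau_end, steps):
    h = tau_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Initial data at t = 0, chosen to satisfy (6): (dt/dtau)^2 - a^2 (dx/dtau)^2 = 1.
ux0 = 0.5
initial = (0.0, 0.0, math.sqrt(1.0 + ux0**2), ux0)
t_end, x_end, ut_end, ux_end = integrate(initial, 1.0, 2000)

conserved = scale_factor(t_end) ** 2 * ux_end                      # a^2 dx/dtau, from (5)
normalization = ut_end**2 - scale_factor(t_end) ** 2 * ux_end**2   # left side of (6)
print(f"a^2 dx/dtau = {conserved:.10f}, normalization = {normalization:.10f}")
```

Setting `ux0 = 0` instead gives the comoving solution: $x$ stays fixed and $t$ advances at the same rate as $\tau$.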

As before, we are to solve two out of three equations. Let us choose (5) and (6). When I 
was an undergrad, I had a professor who liked to say that some equations are so simple that 
you only need your eyeballs, not your brain, to solve them. Speaking more academically, 
we can solve (5) and (6) immediately by inspection. 

Lines of constant $\vec x$ with $t = \tau$ are geodesics, and thus, in this universe, lazily going along 
with the flow is your best bet for extremizing the action. More seriously, on cosmological 
scales with galaxies treated as points, we could label individual galaxies appropriately and 
use the labels as the x coordinates. A coordinate system based on a network of geodesics 
is known as comoving coordinates. (Recall that we already met comoving observers in our 
discussion of perfect fluids in chapter III.6.) See the appendix. 

For future use, we read off the Christoffel symbols from (4) and (5): 


$$\Gamma^0_{ij} = a\dot a\,\delta_{ij}, \qquad \Gamma^i_{0j} = \Gamma^i_{j0} = \frac{\dot a}{a}\,\delta^i_j \tag{7}$$


with all other components vanishing. 


Definition of spatial distance in a general curved spacetime 


Many things are said by many physics texts to be obvious. My self-proclaimed goal is 
to try not to say something is obvious unless it really is obvious by some community 
standard. In a previous section, I said that the proper distance between two points separated 
infinitesimally by $d\vec x$ is given by $a(t)|d\vec x|$. That was pretty obvious, but perhaps we 
ought to be more careful. After all, with spacetime warped this way or that, our intuition 
might fail. 

Instead of the simple spacetime in (2), we now consider the general spacetime described 
by $ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu = g_{00}(x)dt^2 + 2g_{0i}(x)dt\,dx^i + g_{ij}(x)dx^i dx^j$. To keep things specific 
and focused, imagine that we are sitting at a point with spatial coordinates $x^i + dx^i$ 
and our friend is sitting at a nearby point with coordinates $x^i$. Intuitively, you might feel 
that the distance $dl$ between us and our friend is given by $dl^2 \simeq g_{ij}dx^i dx^j$. But with $g_{0i} \neq 0$, 
our intuition may be a bit shaky. Besides, what role does $g_{00} \neq -1$ play? 


Figure 1 The operational definition of the distance to a nearby point involves sending 
a light beam to that point and waiting for it to bounce back to us. (Spacetime diagram: our worldline at $x + dx$ 
carries the events $t_S = t_F + dt_-$ and $t_R = t_F + dt_+$; our friend's worldline at $x$ carries the event $t_F$.) 

The Smart Experimentalist pops up again and says, “Distance is not some fancy theoreti- 
cal construct but the result of an actual measurement.” She instructs us that the operational 
definition of distance involves sending a light beam to our friend and waiting for it to 
bounce back to us. What else could we do? 

Let’s set up the situation with a bit of care (figure 1). Denote by $t_F$ the coordinate time 
when our friend received the signal, by $t_S = t_F + dt_-$ the coordinate time when we sent 
the signal, and by $t_R = t_F + dt_+$ the coordinate time when we received the return signal. 
(The peculiar notation $dt_\pm$ will become clear soon.) The event of our friend receiving 
the signal has spacetime coordinates $(t_F, x^i)$. The two events, sending the signal and 
receiving the return signal, have spacetime coordinates $(t_F + dt_-, x^i + dx^i)$ and 
$(t_F + dt_+, x^i + dx^i)$, respectively. Thus, these two spacetime events have coordinates differing 
by $(t_F + dt_+, x^i + dx^i) - (t_F + dt_-, x^i + dx^i) = (dt_+ - dt_-, 0)$, and hence the elapsed 
proper time (defined as usual by $d\tau^2 = -ds^2$) between the two events is given by 
$d\tau = \sqrt{-g_{00}}\,(dt_+ - dt_-)$. Note that our, and our friend’s, spatial coordinates have not changed: 
that’s what the word “sitting” means. The distance $dl$ between us and our friend is defined 
to be $d\tau$ divided by $2c$, in other words, the elapsed proper time we experienced (not some 
unphysical coordinate time!) divided by light speed and a factor of 2 (to account for the 
round trip our light signal took). 

Indeed, we measure the distance between the earth and the moon, or for that matter, 
between an aircraft and a control tower, in precisely this way using radar ranging. And 
of course these days anybody using the global positioning system (GPS) is, knowingly 
or unknowingly, exploiting this method to pinpoint locations, as already mentioned in 
chapter V.2. 
We now also understand what “nearby” means: the spacetime interval during which the 
bouncing of light (or radar, laser, whatever) takes place has to be sufficiently small so that 
we can neglect any variation in the metric. Treating $g_{\mu\nu}(x)$ as constants will really simplify 
the calculation. Since $ds = 0$ for light, the path traced by our light ray satisfies 


$$0 = g_{00}(x)dt^2 + 2g_{0i}(x)dt\,dx^i + g_{ij}(x)dx^i dx^j \tag{8}$$


This quadratic equation for $dt$ determines how much coordinate time has elapsed when 
light traverses $dx^i$. Solving, we obtain the two roots 


$$dt_\pm = \frac{1}{g_{00}}\left(-g_{0i}dx^i \mp \sqrt{(g_{0i}dx^i)^2 - g_{00}\,g_{ij}dx^i dx^j}\right)$$


Hence the square of the operationally defined distance $dl$ is given by* 


$$dl^2 = -g_{00}\left(\frac{dt_+ - dt_-}{2}\right)^2 = \left(g_{ij} - \frac{g_{0i}g_{0j}}{g_{00}}\right)dx^i dx^j \tag{9}$$

We see the role played by $g_{00}$ and the off-diagonal components of the metric $g_{0i}$. As 
indicated, they correct the naive answer $dl^2 \simeq g_{ij}dx^i dx^j$. Fortunately, $g_{0i}$ vanishes in most 
of the spacetimes we will look at (see, however, exercise 2), including various universes 
(17) to be explored later in this chapter and the Schwarzschild spacetime to be studied 
in the next chapter. In particular, for the expanding universe (2), our naive supposition 
$dl^2 = (a(t)d\vec x)^2$ is indeed correct. Later, in chapter VII.5, when we study rotating black 
holes, we will encounter a spacetime with $g_{0i} \neq 0$ and $g_{00} \neq -1$. 
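
The radar-ranging definition can be checked numerically against the closed form (9). The following sketch solves the light-cone quadratic (8) for a constant metric, forms $dl^2 = -g_{00}\left((dt_+ - dt_-)/2\right)^2$, and compares it with $(g_{ij} - g_{0i}g_{0j}/g_{00})dx^i dx^j$. The metric components and the displacement are made-up numbers chosen only so that $g_{00} < 0$ and $g_{0i} \neq 0$; the function names are hypothetical, and $c = 1$.

```python
import math

def light_travel_roots(g00, g0i, gij, dx):
    """Return (dt_plus, dt_minus), the two roots of
    g00*dt^2 + 2*g0i*dx^i*dt + gij*dx^i*dx^j = 0, assuming g00 < 0."""
    n = len(dx)
    b = sum(g0i[i] * dx[i] for i in range(n))
    c = sum(gij[i][j] * dx[i] * dx[j] for i in range(n) for j in range(n))
    root = math.sqrt(b * b - g00 * c)
    dt_plus = (-b - root) / g00    # positive: the return signal arrives after t_F
    dt_minus = (-b + root) / g00   # negative: the signal was sent before t_F
    return dt_plus, dt_minus

def radar_distance_squared(g00, g0i, gij, dx):
    """dl^2 from half the elapsed proper time sqrt(-g00) * (dt_plus - dt_minus)."""
    dt_plus, dt_minus = light_travel_roots(g00, g0i, gij, dx)
    return -g00 * ((dt_plus - dt_minus) / 2.0) ** 2

def closed_form_distance_squared(g00, g0i, gij, dx):
    """Right-hand side of (9)."""
    n = len(dx)
    return sum((gij[i][j] - g0i[i] * g0i[j] / g00) * dx[i] * dx[j]
               for i in range(n) for j in range(n))

# Illustrative constant metric components and a small coordinate separation.
g00 = -0.8
g0i = [0.1, 0.0, -0.05]
gij = [[1.2, 0.1, 0.0],
       [0.1, 0.9, 0.0],
       [0.0, 0.0, 1.1]]
dx = [1.0e-3, 2.0e-3, -1.0e-3]

dt_plus, dt_minus = light_travel_roots(g00, g0i, gij, dx)
radar = radar_distance_squared(g00, g0i, gij, dx)
closed = closed_form_distance_squared(g00, g0i, gij, dx)
print(f"dt_plus = {dt_plus:.6e}, dt_minus = {dt_minus:.6e}")
print(f"radar dl^2 = {radar:.12e}, closed form = {closed:.12e}")
```

Setting `g0i` to zeros and `g00` to -1 reduces both expressions to the naive $g_{ij}dx^i dx^j$.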


Distances in an expanding universe 


The result (9) holds only for nearby observers. To calculate the distance between two distant 
observers, we will in general have to integrate along lightlike geodesics. For (2), the metric 
is so simple that we can do this radar ranging calculation explicitly. 

Since space is completely homogeneous and isotropic, we can, with no loss of generality, 
suppose that we are sitting at the origin r = 0 and our friend is living in a distant galaxy at 
$r = R$. We saw earlier that an inertial observer can stay at rest at a fixed value of $\vec x$. Indeed, 
this follows from a symmetry argument, since there is no special direction for the inertial 
observer to move in. Since space is homogeneous and isotropic, why should this observer 
move to $\vec x + \Delta\vec x$ any more than to $\vec x - \Delta\vec x$? 

Let us determine the distance to our friend. Send a signal at time $t_S$ and wait for the 
return signal at $t_R$. Denote by $t(R; t_S)$ the time when our friend receives the signal. Note 
that, in an expanding universe, this time depends not only on $R$ but also on when we send 
the signal. Setting $ds = 0$ in (2) for light travel and taking the root $dt = a(t)dr$, we have for 
the outbound trip from $(t_S, 0)$ to $(t(R; t_S), R)$ the relation 


$$R = \int_0^R dr = \int_{t_S}^{t(R;t_S)}\frac{dt}{a(t)} \tag{10}$$
The second equality determines $t(R; t_S)$. For the return trip from $(t(R; t_S), R)$ back to 
$(t_R, 0)$, we take the other root and obtain $-\int_R^0 dr = \int_{t(R;t_S)}^{t_R}\frac{dt}{a(t)}$. Adding, we obtain 
$2R = \int_{t_S}^{t_R}\frac{dt}{a(t)}$. We define the distance between us and our friend by 


$$D(R; t_S) = \tfrac12(t_R - t_S) \tag{11}$$


Communication in an expanding universe 


The scale factor $a(t)$ of the universe* will have to be determined by observation and by
theory (which we will get to in chapter VI.3). Until a decade or so ago, it was thought
that the universe was expanding like a power law $a(t) \propto t^\alpha$. But as you may have heard,
observational evidence now suggests that our universe is well described by (10) with
$a(t) = e^{Ht}$ for some constant $H$, known as the Hubble parameter. (Note that we have set
the origin of time.)

If somebody gives us $a(t)$, then we can evaluate $t(R; t_S)$ and $D(R; t_S)$ using (10) and (11),
respectively. To illustrate what is going on, I will do the exponential case and let you do
the power law case as an exercise. Integrating (10), we obtain $R = H^{-1}(e^{-Ht_S} - e^{-Ht(R;t_S)})$.
Evidently, we should measure distance and time in units of the Hubble distance or Hubble
time $R_H = H^{-1}$; that is, we could set $H = 1$ to lessen clutter. Without pausing to solve
$R = e^{-t_S} - e^{-t(R;t_S)}$, we can see that as $R$ increases, $t(R; t_S)$ increases. This makes sense:
the farther away our friend is, the later she will receive our signal.

But now comes an interesting observation: there exists an $R_{\max} = e^{-t_S}$ at which $t(R; t_S)$
reaches infinity. Thus, if our friend is located farther away from us than that, she will never
receive our signal. In other words, if $R > R_{\max}$, our signal will not be able to catch up with
her. The universe, expanding too fast for light to keep up, is said to have an event horizon,
known as the de Sitter horizon. The terminology is in analogy with the fact that we cannot
see beyond the (everyday) horizon. Note that the later we send the signal, the smaller is
$R_{\max}$. With the passage of time, the horizon closes in on us.

Next, solving $2R = e^{-t_S} - e^{-t_R}$, we obtain the distance between us and our friend


$$D(R; t_S) = \tfrac{1}{2}(t_R - t_S) = -\tfrac{1}{2}\log(1 - 2Re^{t_S}) \qquad (12)$$


Note that the distance depends on $t_S$, as it should since the universe is changing. For $R \ll 1$
(that is, for nearby objects), we recover $D(R; t_S) \simeq Re^{t_S}$, as we would expect from (9). In
contrast, for far away objects, as $Re^{t_S} \to \frac{1}{2}$ from below, $D(R; t_S) \to \infty$. In other words,
in the regime $R_{\max} > R > \frac{1}{2}R_{\max}$, we never hear back from our friend, even though she
receives our message. This makes sense: between the time we sent our message and the
time our friend sends her response, the universe has expanded some more. Light can get
from us to her, but won’t make it back from her to us.
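
The two regimes just described are easy to tabulate. The following is a minimal sketch, assuming $a(t) = e^{t}$ (that is, units with $H = 1$) and taking (10) and (12) in the closed forms $t(R; t_S) = -\log(e^{-t_S} - R)$ and $D(R; t_S) = -\frac{1}{2}\log(1 - 2Re^{t_S})$. The function names are illustrative choices, not notation from the text.

```python
import math

def reception_time(R, t_s):
    """Time t(R; t_s) at which a friend at comoving radius R receives a signal sent at t_s.

    Assumes a(t) = exp(t), i.e. units with H = 1. Returns math.inf beyond the horizon.
    """
    remaining = math.exp(-t_s) - R      # equals exp(-t(R; t_s)) by equation (10)
    if remaining <= 0.0:
        return math.inf                 # the signal never catches up
    return -math.log(remaining)

def radar_distance(R, t_s):
    """Radar distance D(R; t_s) = (t_R - t_s) / 2, with H = 1."""
    argument = 1.0 - 2.0 * R * math.exp(t_s)
    if argument <= 0.0:
        return math.inf                 # the echo never returns
    return -0.5 * math.log(argument)

t_s = 0.0
r_max = math.exp(-t_s)                  # event horizon for a signal sent at t_s
for fraction in (0.01, 0.3, 0.49, 0.6, 1.2):
    R = fraction * r_max
    print(f"R = {R:.2f}: received at t = {reception_time(R, t_s)}, D = {radar_distance(R, t_s)}")
```

For $R$ between $\frac{1}{2}R_{\max}$ and $R_{\max}$ the reception time is finite while the radar distance is infinite: the message arrives, but the reply never does.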


* We will discuss cosmology in more detail later. Clearly, the metric (2) is applicable to the universe only on 
cosmic distance scales, over which the universe appears to be homogeneous and isotropic. 
Figure 2 In an expanding universe, the light cones get ever narrower and
sharper as time goes on. 


Referring back to (10), we see that the existence of an event horizon hinges on the
integral $\int^\infty \frac{dt}{a(t)}$ being finite: an exponentially growing $a(t)$ certainly renders the integral
convergent.
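
To see the convergence criterion at work, here is a small sketch comparing the comoving distance light can cover, $\int_{t_S}^{t_{\rm end}} dt/a(t)$, for an exponential scale factor and for a power law. The closed forms are elementary integrals; the particular exponent $\alpha = 2/3$ and the sample times are illustrative assumptions.

```python
import math

def comoving_reach_exponential(t_s, t_end, H=1.0):
    """Integral of dt / a(t) from t_s to t_end for a(t) = exp(H t)."""
    return (math.exp(-H * t_s) - math.exp(-H * t_end)) / H

def comoving_reach_power_law(t_s, t_end, alpha):
    """Integral of dt / a(t) from t_s to t_end for a(t) = t**alpha, with alpha != 1."""
    p = 1.0 - alpha
    return (t_end**p - t_s**p) / p

for t_end in (1e1, 1e3, 1e6):
    print(t_end,
          comoving_reach_exponential(1.0, t_end),
          comoving_reach_power_law(1.0, t_end, alpha=2.0 / 3.0))
```

The exponential column saturates at $e^{-t_S}$, while the power law column with $\alpha < 1$ keeps growing, so in that case there is no event horizon.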


Observers in an exponentially expanding universe do not move faster 
than the speed of light 


Confusio: “Doesn’t that mean that for $R > R_{\max}$, our friend is moving away faster than the
speed of light, like* Ms. Bright? Doesn’t that violate special relativity?”

No, all that special relativity requires is that the worldline of physical particles lies inside 
the local light cone. In other words, physical particles only have to compare themselves 
with the light rays around them. At any given point, the trajectories of outgoing light rays 
in the radial direction are defined by $dt = a(t)dr = e^{Ht}dr$: the light cones are getting ever
narrower and sharper as time goes on (figure 2). Nevertheless, the worldlines of physical
particles, and of our friend in particular, stay within the light cone. General relativity cannot 
possibly violate special relativity; after all, one is built on the other. 

In a universe with $a(t) = e^{Ht}$, galaxies will pass out of our horizon one after another.
Eventually, we will be left all alone, like unfortunates marooned on a desert island watching 
the pirate ship about to pass over the horizon. 


* You've probably read about her. No? Here is her story:⁵ There was a young lady named Bright / Whose speed
was far faster than light / She went out one day / In a relative way / And returned on the previous night. 
Cosmological redshift


Suppose that a proper time interval $\Delta t_S$ after we sent our initial pulse to our friend, we
send another pulse. To lessen clutter, write $T$ for $t(R; t_S)$, the time she receives the initial
pulse. For her, the proper time interval $\Delta t_R$ between the two pulses follows immediately
from (10): $R = \int_{t_S}^{T} \frac{dt}{a(t)} = \int_{t_S + \Delta t_S}^{T + \Delta t_R} \frac{dt}{a(t)}$, which for a time interval much smaller than the
characteristic time scale of $a(t)$ is given by $\frac{\Delta t_R}{a(T)} = \frac{\Delta t_S}{a(t_S)}$. We can translate this immediately
into a frequency shift for an electromagnetic wave emitted with frequency $\omega_S$:


$$\frac{\omega_S}{\omega_R} = \frac{a(T)}{a(t_S)} \equiv 1 + z \qquad (13)$$


where we have defined the redshift factor $z$ used by astronomers. In an expanding universe,
$a(T) > a(t_S)$ and so $z > 0$, corresponding to a redshift. The frequency $\omega_R$ at receipt is less
than the frequency at emission $\omega_S$.

I must emphasize that, in this chapter, for pedagogical clarity I have often specialized to 
the case $a(t) = e^{Ht}$ to illustrate the point being made, but evidently our discussion often
holds for other forms of a(t). For example, the derivation of the redshift formula (13) does 
not depend at all on the assumed form of a(t). As I said, you will eventually learn how to 
determine a(t) given the content of the universe. 
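
The two-pulse argument can be checked numerically. This is a sketch under the assumption $a(t) = e^{t}$ (units with $H = 1$), for which (10) gives the reception time in closed form. The values of `R`, `t_s`, and `dt_s` are arbitrary illustrative choices.

```python
import math

def scale_factor(t):
    return math.exp(t)                     # assumed a(t) = exp(t), i.e. H = 1

def reception_time(R, t_s):
    """Solve R = integral of dt / a(t) from t_s to T for T, using equation (10)."""
    return -math.log(math.exp(-t_s) - R)   # valid only for R < exp(-t_s)

R, t_s, dt_s = 0.4, 0.0, 1e-6
T = reception_time(R, t_s)
dt_r = reception_time(R, t_s + dt_s) - T   # interval between the two pulses on arrival

one_plus_z = scale_factor(T) / scale_factor(t_s)
print(dt_r / dt_s, one_plus_z)             # the two numbers should nearly agree
```

The ratio of received to sent intervals matches $a(T)/a(t_S)$ up to corrections of order the pulse spacing, which is the content of the redshift formula.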


Light rays at 45° 


It is rather inconvenient to have the shape of the light cone depend on where we are.
Now that we have built up some intuition about Minkowski spacetime, we would prefer
radial light rays to always travel along the 45° lines. You can readily accomplish this by a
coordinate change. I will take you through the rather simple steps involved. Change $t$ to $\eta$
and require $ds^2 = -dt^2 + a(t)^2 d\vec x^2 = b(\eta)^2(-d\eta^2 + d\vec x^2)$ with some unknown function $b(\eta)$.
The coefficient of $d\vec x^2$ tells us that $b(\eta) = a(t)$, which allows us to determine $\eta$ in terms of
$t$ by the relation $d\eta = \frac{dt}{b(\eta)} = \frac{dt}{a(t)}$.

The variable $\eta$ is sometimes known as conformal time. At a given conformal time $\eta$, spacetime
is Minkowskian up to the overall factor $b(\eta)^2$, and light propagates along 45° lines $d\eta = \pm|d\vec x|$ in the $(\eta, \vec x)$ plane.

For any given $a(t)$, we can work everything out explicitly. For example, suppose
$a(t) = (t/T)^{1/2}$, with $T$ some characteristic time scale, which as we will see in chapter VIII.1,
may have been the case in the early universe. Then $\eta = 2(Tt)^{1/2}$ (an irrelevant integration
constant has been absorbed into $\eta$) and


$$ds^2 = \left(\frac{\eta}{2T}\right)^2(-d\eta^2 + d\vec x^2) \qquad (14)$$


For $a(t) = e^{Ht}$, the variable $\eta = -e^{-Ht}/H$ increases from $-\infty$ toward 0 as $t$ goes from
$-\infty$ to $+\infty$. Note that our sign choices are such that $dt$ and $d\eta$ have the same sign and
that $\eta < 0$. We obtain


$$ds^2 = \frac{1}{(H\eta)^2}(-d\eta^2 + d\vec x^2) \qquad (15)$$
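
As a quick check of the relation $d\eta = dt/a(t)$, the sketch below integrates it numerically and compares with the closed forms, assuming the power law example is $a(t) = (t/T)^{1/2}$ with $\eta = 2(Tt)^{1/2}$ and the exponential example is $a(t) = e^{Ht}$ with $\eta = -e^{-Ht}/H$. The Simpson integrator and the numerical values of `T_scale` and `H` are illustrative choices, not taken from the text.

```python
import math

def conformal_time_interval(a, t0, t1, steps=2000):
    """Composite Simpson estimate of the integral of dt / a(t) from t0 to t1."""
    if steps % 2:
        steps += 1                          # Simpson's rule needs an even number of steps
    h = (t1 - t0) / steps
    total = 1.0 / a(t0) + 1.0 / a(t1)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) / a(t0 + i * h)
    return total * h / 3.0

T_scale, H = 2.0, 0.5                       # illustrative values only

# Power law a(t) = (t/T)^(1/2): closed form eta = 2 (T t)^(1/2)
numeric_power = conformal_time_interval(lambda t: math.sqrt(t / T_scale), 1.0, 3.0)
closed_power = 2.0 * math.sqrt(T_scale * 3.0) - 2.0 * math.sqrt(T_scale * 1.0)

# Exponential a(t) = exp(H t): closed form eta = -exp(-H t) / H
numeric_exp = conformal_time_interval(lambda t: math.exp(H * t), 1.0, 3.0)
closed_exp = -math.exp(-H * 3.0) / H + math.exp(-H * 1.0) / H

print(numeric_power, closed_power)
print(numeric_exp, closed_exp)
```

Only differences of $\eta$ are compared, so the integration constant absorbed into $\eta$ drops out.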
Closed and open universes


“Sandage, can you really envisage curved space and the beauties
of Riemannian geometry, so necessary for relativity?” I replied,
“No, Father, I have tried and tried, using all the tricks known to
visualize curved space, but my visualizations have so far failed.”
Lemaitre then sighed and said, “I understand, but it is a pity
because the visualization is so beautiful. Perhaps it might be
best for you to change fields.” He said it gently, like a father to
a son.⁶


—the distinguished cosmologist Allan Sandage, 
recalling his encounter with Georges Lemaitre 
in 1961 in Santa Barbara, California 


Whew! Perhaps we should change fields. Well, the only way to gain some intuitive feel 
for curved spacetimes is to be exposed to several examples, and that’s what we start to 
do here. Incidentally, Lemaitre was a Catholic priest, hence the form of address used by 
Sandage. 

After Lemaitre proposed (2) in 1925, he was unhappy about the Euclidean character of
space. Spacetime is curved, but space is flat. Then in 1927, he managed to write down the
modern form with space compactified into a 3-sphere $S^3$. It should not take you 2 years,
however, given that you were exposed to $S^3$ back in chapter I.6. Simply replace the metric
$(dr^2 + r^2 d\Omega^2)$ for Euclidean 3-space $E^3$ in (2) by the metric


$$\frac{dr^2}{1 - r^2/L^2} + r^2 d\Omega^2$$


for $S^3$. Indeed, from chapter I.6, you even know how to write down the metric for the
hyperbolic space $H^3$ (I.6.15). Thus, we can combine these three cases into


$$ds^2 = -dt^2 + a(t)^2\left[\frac{dr^2}{1 - kr^2/L^2} + r^2 d\Omega^2\right] \qquad (16)$$


with the integer $k = 1$, 0, or $-1$ corresponding to a (spatially) closed, flat, or open universe,
respectively. Much of the discussion in this chapter can then be repeated for the $k = \pm 1$
cases, which I will leave to you in the exercises.

As we will see in part VIII, the universes described by (16), commonly called
Friedmann-Robertson-Walker universes, form the basis of modern cosmology. More correctly, they
should be called Friedmann-Lemaitre-Robertson-Walker, or perhaps simply
Friedmann-Lemaitre universes, since the work of Robertson and of Walker⁷ was considerably later, in
the 1930s.
You might have noticed that we can absorb* the length scale $L$ into $r$. Actually, some
authors do precisely that and write $ds^2 = -dt^2 + a(t)^2\left(\frac{dr^2}{1 - kr^2} + r^2 d\Omega^2\right)$, so that $r$ is
dimensionless. But I think this confuses some people, because it now seems that $k = \pm 1, 0$
are three discrete possibilities. One often hears the assertion that “the universe is now
known to be flat” as if it were an absolute statement.† In fact, physical observation can
only give us a (possibly very large) lower bound on $L$. For this reason, the reader will see
me often laboriously dragging $L$ around when I could have chucked it.

It is often convenient to transform coordinates by setting $L\psi = L\int d\psi = \int \frac{dr}{\sqrt{1 - kr^2/L^2}}$ in
(16). Integrating, we have $r = L\sin\psi$ (closed), $r = L\psi$ (flat), $r = L\sinh\psi$ (open) for the three
cases. Thus, the closed, flat, and open universe are described by


$$ds^2 = -dt^2 + L^2 a(t)^2(d\psi^2 + \sin^2\psi\, d\Omega^2) \quad \text{(closed)} \qquad (17)$$

$$ds^2 = -dt^2 + L^2 a(t)^2(d\psi^2 + \psi^2 d\Omega^2) \quad \text{(flat)} \qquad (18)$$

and

$$ds^2 = -dt^2 + L^2 a(t)^2(d\psi^2 + \sinh^2\psi\, d\Omega^2) \quad \text{(open)} \qquad (19)$$


The reader may recall from chapter I.6 that the spatial section for the closed and open
universe describes the 3-dimensional sphere $S^3$ and hyperbolic space $H^3$, respectively.


Proper distances in cosmology and a “cosmic conspiracy” 
In the literature, people often invite themselves to define, perhaps a bit sloppily, a “proper
distance” $d(t, R)$ between two distant points $(t, 0, \theta, \varphi)$ and $(t, R, \theta, \varphi)$ by integrating the
length segment $dl = \sqrt{g_{rr}(t, r)}\,dr$ derived in (9):


$$d(t, R) = a(t)\int_0^R \frac{dr}{\sqrt{1 - kr^2/L^2}} \qquad (20)$$

But we just did this integral. Thus, we have

$$d(t, R) = a(t)L\sin^{-1}(R/L) \quad \text{(closed)}$$
$$d(t, R) = a(t)R \quad \text{(flat)}$$
$$d(t, R) = a(t)L\sinh^{-1}(R/L) \quad \text{(open)} \qquad (21)$$


As expected, for small $R$, $d(t, R) \simeq a(t)R$. For the closed universe, $d(t, R)$ is only defined
for $R < L$.
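
A short sketch may help in comparing the three cases. It assumes (21) reads $d = aL\sin^{-1}(R/L)$, $aR$, and $aL\sinh^{-1}(R/L)$ for $k = +1, 0, -1$. The function name and the sample numbers are illustrative only.

```python
import math

def proper_distance(a_t, R, L, k):
    """Proper distance d(t, R) for k = +1 (closed), 0 (flat), -1 (open)."""
    if k == 1:
        if R > L:
            raise ValueError("closed universe requires R <= L")
        return a_t * L * math.asin(R / L)
    if k == 0:
        return a_t * R
    if k == -1:
        return a_t * L * math.asinh(R / L)
    raise ValueError("k must be +1, 0, or -1")

a_t, L = 1.5, 10.0                      # illustrative numbers only
for R in (0.01, 1.0, 5.0, 9.9):
    print(R, [proper_distance(a_t, R, L, k) for k in (1, 0, -1)])
```

For $R$ much smaller than $L$ the three values coincide with $a(t)R$; as $R$ approaches $L$ the closed case exceeds the flat one, which in turn exceeds the open one.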

You might wonder why $d(t, R)$ is so proper: as we have seen in the derivation of (9),
$a(t)|\Delta\vec x|$ is the actual operational distance between two neighboring points separated


* For a more careful treatment, see chapter X.1. 

† Next time you hear this statement, you will know to ask how a physical measurement could possibly give
an exact result without any error bars. Typically, the speaker will mumble something about k being a discrete 
variable that can only take on three integer values k = 1, 0, and —1. 
by $\Delta\vec x$ only if the separation is infinitesimal. Nevertheless, $d(t, R)$ is useful because for
$R \ll L$, it agrees with the variety of distance measures* used by observational cosmologists.

As emphasized by Weinberg,⁸ for example, to interpret $d(t, R)$ as a physical distance
requires what he called a cosmic conspiracy. We need to line up, between $r = 0$ and $r = R$,
many comoving observers, each separated from the next by some infinitesimal $dr$. At some
agreed-upon cosmic time $t$, they would bounce light off a nearby observer to measure the
proper distance $\sqrt{g_{rr}(t, r)}\,dr$ to that proximate neighbor and report the result to some
central authority, who sums up $\sqrt{g_{rr}(t, r)}\,dr$ and then multiplies by $a(t)$ to form $d(t, R)$
as per (20).


Appendix: Comoving coordinates 


The reader can skip this appendix on comoving coordinates upon a first reading. 

Before we can describe what they are, our friend the Jargon Guy pops up and says that comoving coordinates 
are also known as Gaussian normal. Thank you, but let’s give a physical rather than mathematical description. 
As mentioned in the text, a collection of freely falling particles (such as galaxies in the context of cosmology, or 
the dust in a collapsing cloud on its way to form a star or a black hole) naturally provides a set of coordinates that 
makes particularly good sense to physicists. We use some suitable labels on the particles as the $\vec x$ coordinates.
For $t$, we use the proper time experienced by the particle. In other words, imagine each particle carrying a clock,
the reading on which we take to be $t$.

This last statement implies that $d\tau^2 = -g_{\mu\nu}dx^\mu dx^\nu$ evaluated for $d\vec x = 0$ is equal to $dt^2$, that is, $-g_{00}dt^2 = dt^2$.
Thus, in comoving coordinates, $g_{00} = -1$.


That the particles are freely moving means that constant $\vec x$ corresponds to geodesics, with $\tau = t$. Setting
$\vec x$ constant in the geodesic equation $\frac{d^2x^i}{d\tau^2} + \Gamma^i_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0$, we learn immediately that $\Gamma^i_{00} = 0$, which by the
definition of $\Gamma$, implies $g^{ij}\partial_0 g_{0j} = 0$. For the sort of spacetimes we will deign to consider, $g_{ij}$ is nonsingular, so
we can conclude that

$$\partial_0 g_{0i} = 0 \qquad (22)$$


What we would like to get is $g_{0i}(t, \vec x) = 0$. Notice that (22) tells us that if we could fulfill our heart’s desire at one
particular instant in time $t$, we have it for all time.
By now you know the trick we have at our disposal: find a coordinate transformation to get rid of $g_{0i}$.
We are free to go around resetting the clocks on each particle, namely $t = t' + f(\vec x')$ and $\vec x = \vec x'$, so that
$g'_{0j} = g_{\mu\nu}\frac{\partial x^\mu}{\partial x'^0}\frac{\partial x^\nu}{\partial x'^j} = g_{0j} - \frac{\partial f}{\partial x^j}$, where we have used $g_{00} = -1$.
The trouble is that to get $g'_{0j} = 0$, we have three (in general, $d - 1$) equations for one unknown function $f$.

So we can’t do it in general. But we can do it in two cases. 


1. Spherical symmetry. Since $d\theta$ and $d\varphi$ must appear in the combination $d\theta^2 + \sin^2\theta\,d\varphi^2$, the components
$g_{0\theta}$ and $g_{0\varphi}$ vanish, and the three equations collapse to one: $\frac{\partial f}{\partial r} = g_{0r}$, with the solution
$f(r) = \int^r dr'\,g_{0r}(r')$, possible because $g_{0r}$ does not depend on $t$ by (22). Dropping primes, we arrive at the
comoving coordinates


$$ds^2 = -dt^2 + B(t, r)dr^2 + C(t, r)(d\theta^2 + \sin^2\theta\,d\varphi^2) \qquad (23)$$


This metric, depending on two unknown functions of $t$ and $r$, would be particularly suitable for studying
the gravitational collapse of a spherical cloud of dust.


2. Polar-like coordinates. At $t = 0$, around a specified particle, we can always go to locally flat coordinates
$y^\alpha$. We have $g_{0i}(0, \vec x) = \eta_{\alpha\beta}\frac{\partial y^\alpha}{\partial t}\frac{\partial y^\beta}{\partial x^i}\Big|_{t=0}$. Suppose these coordinates have the following two properties:


* For example, they define a luminosity distance by measuring the apparent brightness of standard candles. 
(a) $\frac{\partial y^0}{\partial x^i}\Big|_{t=0} = 0$ and (b) $\frac{\partial y^i}{\partial t}\Big|_{t=0} = 0$. Condition (a) says that the separation $y^\alpha(0, \vec x + \delta\vec x) - y^\alpha(0, \vec x) \simeq \delta x^i\frac{\partial y^\alpha}{\partial x^i}\Big|_{t=0}$
between the specified particle and its neighbors is purely spatial, that is, it vanishes when
$\alpha = 0$. Condition (b) says that the movement of the particle in spacetime is purely temporal. It then
follows that $g_{0i}(0, \vec x) = 0$ and hence by (22), $g_{0i}(t, \vec x) = 0$. We arrive at

$$ds^2 = -dt^2 + g_{ij}(t, \vec x)dx^i dx^j \qquad (24)$$


Note that in the text, for the universe, we have the additional requirements of homogeneity and isotropy, 
which restrict the spatial metric g;; further and license us to write (16). 

To help you understand conditions (a) and (b) better, let me reveal that the familiar polar coordinates
and spherical coordinates are Gaussian normal coordinates for $E^2$ and $E^3$, respectively. To see this,
simply let $t \to r$, $x^i \to \theta^i$, and flip a sign to rewrite (24) as


$$ds^2 = dr^2 + g_{ij}(r, \theta^1, \cdots, \theta^{d-1})d\theta^i d\theta^j \qquad (25)$$


In going from spacetime to space, we see that “a collection of freely falling particles” gets translated
into “a collection of straight lines,” the locally flat coordinates $y^\alpha$ into “Cartesian coordinates” centered
at the point we are focusing on, with $y^r$ pointing in the radial direction and $y^\theta$ in the angular direction.
Condition (a) says that $\left(\frac{\partial y^r}{\partial\theta}\right)_{r=a} = 0$, and condition (b) that $\left(\frac{\partial y^\theta}{\partial r}\right)_{r=a} = 0$. Draw a picture for polar
coordinates to see for yourself that it all makes sense. We have replaced the arbitrary setting $t = 0$ on
the clock in the spacetime discussion by $r = a$ in the polar discussion to avoid the inconvenient fact
that even at physicists’ level of rigor, polar coordinates are ill defined at the origin.

Physicists have generally borrowed the principle of presumed innocence from the Anglo-American
legal system. As implied in the introduction to this chapter, we will happily presume, unless proven
otherwise, that what we know about space works for spacetime also (except for the obvious stuff due
to the crucial flip of sign). Here is an interesting example of going the other way, using the physics of
freely falling dust to illuminate something about space, something that presumably even Gauss had to
work a little to figure out.


Exercises 


1 Show that (5) and (6) imply (4) by differentiating (4? — a 


ney with respect to T. 


2 Consider $ds^2 = -dt^2 - 2\sin x\,dt\,dx + \cos^2 x\,dx^2 + dy^2 + dz^2$. Using (9), calculate $dl$ between two points
separated by $dx$. Can you explain the result you obtain?


3 Explore the behavior of $D(R; t_S)$ for two cases of power law $a(t) \propto t^\alpha$: $\alpha = \frac{2}{3}$ for a universe dominated by cold
matter, and $\alpha = \frac{1}{2}$ by radiation (as we will see in chapter VIII.1).


4 Evaluate the redshift formula for a universe with an exponentially growing $a(t)$ and for a universe with a
power law $a(t)$.


5 Derive (17).


6 Derive (21).


7 Show that the relation between redshift and scale factor derived in the text for the flat universe holds just as
well for curved universes.


8 Extend the discussion in the text for $k = 0$ to the cases $k = \pm 1$.
Notes


1. H. Nussbaumer and L. Bieri, Discovering the Expanding Universe, p. 196.

2. For all historical remarks in this chapter, see H. Nussbaumer and L. Bieri, Discovering the Expanding Universe,
particularly pp. 195-199.

3. See http://www.youtube.com/watch?v=5U1-OmAICpU.

4. We encounter $-g_{00}/(g_{00})^2 = -1/g_{00}$.

5. See Fearful, pp. xx and 68.

6. H. Nussbaumer and L. Bieri, Discovering the Expanding Universe, p. xvii.

7. I talked to a few cosmologists while writing this book. They were unanimous that the terminology
“Robertson-Walker universe” should be dropped.

8. S. Weinberg, Gravitation and Cosmology, p. 415.


V.4 Motion in Curved Spacetime 


Presence of external forces 


I explained in chapter V.1 that we can, with no further work, simply lift the geodesic 
equation 


$$\frac{d^2X^\rho}{d\tau^2} + \Gamma^\rho_{\mu\nu}\frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau} = 0 \qquad (1)$$

from part II of this book and study motion in curved spacetime. Indeed, we did precisely 
that in the preceding chapter. Unfortunately, the frequent appearance of (1) leads some 
students to a misconception that material particles and observers are obliged to follow 
geodesics. To the contrary, as a young observer, you are certainly free to strap a rocket pack 
to your back and blast off in this direction or that. 

The geodesic equation (1) describes the motion of a particle in the absence of any other 
force besides gravity, as is clear from the derivation in chapter V.1. With another force 
present, the particle follows 


$$\frac{d^2X^\rho}{d\tau^2} + \Gamma^\rho_{\mu\nu}\frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau} = f^\rho(X) \qquad (2)$$


In particular, in an electromagnetic field, the force is given by


$$f^\rho(X) = \frac{e}{m}F^\rho{}_\nu(X)\frac{dX^\nu}{d\tau}$$


for a particle of charge $e$ and mass $m$. Here $F^\rho{}_\nu = g^{\rho\mu}F_{\mu\nu}$. (See chapter V.6 for further
discussion.) Note the glaring contrast between the gravitational force and the electromagnetic
force: one is universal, the other not. In other words, the left hand side of (2) makes no 
reference to any properties of the particle, while the right hand side depends on the ratio 
e/m, which varies enormously from particle to particle. 

That Einstein could write down the equation of motion in the combined presence of a 
gravitational and electromagnetic field, rather than spend years looking for a more general 
general relativity, testifies to the power of the equivalence principle. We simply promote
(IV.1.16) to (2). Conversely, setting $g_{\mu\nu}$ to $\eta_{\mu\nu}$ in (2), we revert back to (IV.1.16).


What is a free particle? 


Physicists are fond of speaking of free particles: a particle moving in the absence of any 
external forces is said to be free. But in Einstein gravity or general relativity, the concept is 
subtly different. In Einstein gravity, a particle following a geodesic in spacetime described 
by (1) is said to be free. In other words, a free particle* is not acted upon by any external 
forces except for gravity. In Einstein’s theory, the gravitational field is equivalent to curved 
spacetime. 

Part of the confusion can be regarded as semantic. We could say that there is no gravity in 
Einstein gravity, only curved spacetime. Perhaps the best policy is to dispense with words 
and look at only the equations. If the motion of a particle satisfies (1), it is free. If the 
motion of a particle satisfies (2), it is not free. 

Let’s put it more dramatically to underline the point. As you fall, at ever increasing speed, 
into the unrelenting clutch of a black hole, perhaps never again to be released, you are free.
But if you turn on your rocket pack as you approach the horizon, and blast your way back 
out to infinity to live out the rest of your days, you are not free. While a layperson might 
find these statements paradoxical, you and I happily know that they originate in the patent 
clerk’s happy thought, that a freely falling particle does not feel gravity. 


Recovery of Newtonian motion in a gravitational field 


Einstein’s description of a particle moving in a gravitational field—that the particle is 
seeking the “best” possible path in spacetime—is at first sight strikingly different from 
Newton’s—that the particle is acted on by a force. One advantage of the action principle 
treatment given in part IV of this book is that it renders going from Newton to Einstein 
more natural and makes clear that Einstein’s description must reduce to Newton’s. 

Let’s recover Newton’s equation from Einstein’s equation. This is not entirely obvious,
since the geodesic equation (1) appears to give the 4-acceleration $\frac{d^2X^\rho}{d\tau^2}$ in terms of two
powers of the 4-velocity $\frac{dX^\mu}{d\tau}$.
To recover the Newtonian limit, three conditions must be met:


1. The particle moves slowly: $\frac{dX^i}{d\tau} \ll \frac{dX^0}{d\tau}$.


* As you know, the term “test particle” is often used to emphasize that not only is the particle small enough for 
its internal structure not to be relevant but it is also insignificant enough not to affect whatever is producing the 
external forces. In the case of Einstein gravity, when we say particle, we assume that the particle does not affect 
and modify the curved spacetime it is in. As you also know, this idealizing and simplifying assumption often 
does not hold in physically interesting situations, such as two black holes circling each other. See chapter X.4 for 
a first step away from (1). 
2. The gravitational field is weak, so that the metric is almost Minkowskian: $g_{\mu\nu} \simeq \eta_{\mu\nu} + h_{\mu\nu}$
with $h$ small in the sense that we can neglect terms quadratic in $h$.
3. The gravitational field $h_{\mu\nu}$ does not depend on time.

Condition 1 means $\frac{d^2X^\rho}{d\tau^2} + \Gamma^\rho_{00}\left(\frac{dX^0}{d\tau}\right)^2 \simeq 0$, while 2 and 3 imply $\Gamma^\rho_{00} \simeq -\frac{1}{2}\eta^{\rho\sigma}\partial_\sigma h_{00}$, so that
$\Gamma^0_{00} \simeq \frac{1}{2}\partial_0 h_{00} = 0$ and $\Gamma^i_{00} \simeq -\frac{1}{2}\partial_i h_{00}$. The geodesic equation (1) then reduces to $\frac{d^2X^0}{d\tau^2} \simeq 0$
(which implies that $\frac{dX^0}{d\tau}$ is a constant) and $\frac{d^2X^i}{d\tau^2} - \frac{1}{2}\partial_i h_{00}\left(\frac{dX^0}{d\tau}\right)^2 \simeq 0$, which since $X^0 = t \simeq \tau$
(because of condition 1) becomes $\frac{d^2X^i}{dt^2} \simeq \frac{1}{2}\partial_i h_{00}$. Thus, if we identify the gravitational potential
$\Phi$ by $h_{00} = -2\Phi = \frac{2GM}{r}$, we obtain Newton’s equation $\frac{d^2\vec X}{dt^2} \simeq -\vec\nabla\Phi$. As you see, the secret
to Newton’s equation emerging is that in the “force term” $\Gamma^\rho_{\mu\nu}\frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau}$, the time component
dominates the space components.

This derivation shows that far from a spherically symmetric mass distribution, the
spacetime metric must be such that $g_{00} \to -\left(1 - \frac{2GM}{r}\right)$. Notice also that our derivation does
not depend on $h_{ij}$, nor on $h_{0i}$, as long as they are time independent.

The result we just obtained, that $g_{00} \simeq -\left(1 - \frac{2GM}{r}\right)$ in the Newtonian limit, is entirely
consistent with option G in chapter IV.1.
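
The identification of $h_{00}$ with the Newtonian potential can be checked numerically. The sketch below assumes the signature used in this part of the book, $g_{00} = -(1 + 2\Phi)$, so that $h_{00} = -2\Phi = 2GM/r$ and the slow-motion geodesic equation reads $d^2X^i/dt^2 \simeq \frac{1}{2}\partial_i h_{00}$. The units with $GM = 1$, the finite-difference step, and the function names are illustrative choices.

```python
import math

GM = 1.0                                    # illustrative units with G M = 1

def h00(x, y, z):
    """Weak-field perturbation h_00 = -2 Phi = 2 G M / r."""
    return 2.0 * GM / math.sqrt(x * x + y * y + z * z)

def acceleration(x, y, z, eps=1e-5):
    """d^2 X^i / dt^2 = (1/2) d_i h_00, estimated with central finite differences."""
    ax = 0.5 * (h00(x + eps, y, z) - h00(x - eps, y, z)) / (2.0 * eps)
    ay = 0.5 * (h00(x, y + eps, z) - h00(x, y - eps, z)) / (2.0 * eps)
    az = 0.5 * (h00(x, y, z + eps) - h00(x, y, z - eps)) / (2.0 * eps)
    return ax, ay, az

# At r = 2 on the x axis, Newton gives an inward acceleration of magnitude GM / r^2 = 0.25.
print(acceleration(2.0, 0.0, 0.0))
```

The acceleration points toward the mass and falls off as $1/r^2$, as Newton would have it.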


Gravitational redshift 


In chapter V.2, we derived gravitational redshift using the equivalence principle. Let us 
derive this result again using curved spacetime. For pedagogical clarity, we assume that the 
spacetime is static, in other words, that the metric does not depend on the time coordinate 
t (for example, the metric to be presented in the next section). 

Suppose an observer (call him the emitter) located at some fixed $\vec x_E$ sends light signals at
regular intervals, separated by $\Delta\tau_E$, as measured by his proper time, of course, to another
observer (call her the receiver) at some fixed $\vec x_R$. Consider a particular light signal traveling
from $\vec x_E$ to $\vec x_R$ following some trajectory (which, using the metric, we could determine, but
which, as we will see presently, we don’t need to know). In a static spacetime, physics is
invariant under translation in time. Thus, the next signal would travel (see figure 1) by the
same trajectory simply displaced by coordinate time $\Delta t_E = \Delta\tau_E/\sqrt{-g_{00}(\vec x_E)}$. (Since this
observer is fixed at $\vec x_E$, for him $d\tau^2 = -g_{\mu\nu}dx^\mu dx^\nu = -g_{00}(\vec x_E)dt^2$.)

Here is the question for you to work out before reading on. When will the receiver receive
the next signal?

In coordinate time, she receives this signal $\Delta t_E$ after the preceding signal (according to
the time translation invariance argument just given). But we have to express this in terms of
the receiver’s proper time (namely the time she experiences). For the receiver, the proper
time interval between the two signals is given by


$$\Delta\tau_R = \sqrt{-g_{00}(\vec x_R)}\,\Delta t_E = \Delta\tau_E\sqrt{\frac{g_{00}(\vec x_R)}{g_{00}(\vec x_E)}} \qquad (3)$$


We can translate this result into the frequency shift for an electromagnetic wave. Just
think of the successive crests of the wave as the signals. Thus, the frequency $\omega_R$ seen by
the receiver is related to the frequency $\omega_E$ of the emitter by


$$\omega_R = \omega_E\sqrt{\frac{g_{00}(\vec x_E)}{g_{00}(\vec x_R)}} \qquad (4)$$


Figure 1 In a static spacetime, physics is invariant
under translation in time. A light signal sent after
another light signal would travel by the same trajectory
simply displaced by some coordinate time.


We assumed that the emitter and receiver are fixed at $\vec x_E$ and $\vec x_R$, respectively, but for (4)
to hold, all that is required is that the emitter and receiver do not move appreciably during
a time $\sim 1/\omega$.

Note that all we used is that the metric is time translation invariant. In particular, in the
weak field limit, we have the fractional frequency shift
$(\omega_R - \omega_E)/\omega_E \simeq \sqrt{\frac{1 + 2\Phi(\vec x_E)}{1 + 2\Phi(\vec x_R)}} - 1 \simeq \Phi(\vec x_E) - \Phi(\vec x_R)$.
A receiver located in a region with a higher gravitational potential sees a
lower frequency. We have recovered the gravitational redshift, which we derived using the
equivalence principle in chapter V.2. A useful mnemonic is that a photon of energy $\hbar\omega$ loses
energy when climbing out of a gravitational potential well, but this mnemonic does not
amount to a correct argument,¹ since it confounds a quantum relation with a Newtonian
concept (namely the gravitational potential, which cannot be applied to a massless particle
anyway).

As this derivation makes clear, and as was already mentioned in chapter V.2, what is 
commonly called gravitational redshift should more accurately be called gravitational time 
dilation. The phenomenon involved does not have to be characterized by a frequency at all. 
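
To get a feel for how good the weak field approximation is, the following sketch compares the exact ratio in (4) with the first-order result $\Phi(\vec x_E) - \Phi(\vec x_R)$. It assumes $g_{00} = -(1 + 2\Phi)$ with $c = 1$. The two potential values are made-up dimensionless numbers chosen only for illustration.

```python
import math

def frequency_ratio(phi_emitter, phi_receiver):
    """omega_R / omega_E from equation (4), with g_00 = -(1 + 2 Phi) and c = 1."""
    return math.sqrt((1.0 + 2.0 * phi_emitter) / (1.0 + 2.0 * phi_receiver))

phi_e, phi_r = -3.0e-6, -1.0e-6            # made-up potentials; the receiver sits higher
exact_shift = frequency_ratio(phi_e, phi_r) - 1.0
weak_field_shift = phi_e - phi_r

print(exact_shift, weak_field_shift)       # both negative: the receiver sees a redshift
```

The two numbers differ only at second order in the potential, which is why the first-order formula suffices for weak fields.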


Spacetime around a spherically symmetric mass distribution 


We are not yet in a position to determine the metric in any given physical situation, but 
from symmetry alone, we can learn a lot about the spacetime metric $g_{\mu\nu}$. The expanding
universe in the preceding chapter is a good example. Here we examine the spacetime 
around a spherically symmetric mass distribution. The mass distribution may depend on 
time, such as that of a pulsating spherical star, for example. 
Because of general coordinate invariance, we have considerable freedom in choosing
the coordinates. This corresponds to picking a gauge in electromagnetism. At this point,
the rich man with his or her wealth of fancy terms starts talking about Killing vectors, and
possibly even foliation. We will get to all that later. But for the moment, it is pedagogically
more transparent to follow the poor man’s way,* using explicit nuts and bolts, wearing no
fancy pants.

By assumption, space is isotropic: there is no privileged direction. Thus, the differential
elements we use must be rotation invariant, and there are three such elements: $dt$,
$d\vec x^2 = d\vec x\cdot d\vec x$, and $\vec x\cdot d\vec x$. Go to spherical coordinates so that as usual,
$d\vec x^2 = dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2) = dr^2 + r^2 d\Omega^2$. Differentiating $r^2 = \vec x\cdot\vec x$, we have $r\,dr = \vec x\cdot d\vec x$ and
so $(\vec x\cdot d\vec x)^2 = r^2 dr^2$.

Pythagoras (as generalized to spacetime) requires the line element $ds^2$ to be quadratic
in $dt$ and $d\vec x$. The inventory just given shows that we have four quadratic differentials to
construct $ds^2$ with, namely $dt^2$, $dt\,dr$, $dr^2$, and $d\Omega^2$. Isotropy means that the coefficients of
these quadratic differentials in $ds^2$ cannot depend on $\theta$ and $\varphi$, and they are thus functions
of only $t$ and $r$. Putting it all together, we obtain
$ds^2 = -U(t, r)dt^2 - 2V(t, r)\,dt\,dr + W(t, r)dr^2 + (X(t, r))^2 d\Omega^2$ with four arbitrary functions of $t$ and $r$.

Spherical symmetry has gotten us quite far, but we still have the freedom of changing
coordinates. First, define a new radial coordinate $\tilde r = X(t, r)$, so that $d\tilde r = \partial_t X(t, r)dt + \partial_r X(t, r)dr$.
Eliminate $r$ and $dr$ in terms of $\tilde r$, $t$, $d\tilde r$, and $dt$ in $ds^2$ to obtain a mess of the
form $ds^2 = -\tilde U(t, \tilde r)dt^2 - 2\tilde V(t, \tilde r)\,dt\,d\tilde r + \tilde W(t, \tilde r)d\tilde r^2 + \tilde r^2 d\Omega^2$. We have effectively gotten
rid of $X(t, r)$.

There is no need to work out the mess; we merely note that it has the indicated
form. Now we simply rename functions and variables by dropping the twiddles to obtain
$ds^2 = -U(t, r)dt^2 - 2V(t, r)\,dt\,dr + W(t, r)dr^2 + r^2 d\Omega^2$. (These are of course not the
same functions we started out with, but there is no point in using up more letters.)

Suppose somebody gives us this $ds^2$ with these three functions U, V, and W. We now 
show that we still have enough freedom to get rid of the nasty $dt\,dr$ term. Define a new time 
coordinate $\tilde t$ by $d\tilde t = \zeta(t, r)(U dt + V dr) = \partial_t\Phi(t, r)dt + \partial_r\Phi(t, r)dr$, where the unknown 
function $\zeta(t, r)$ is determined by the condition that the second equality holds for some $\Phi$. 
In other words, we require that $\zeta(U dt + V dr)$ be a total differential, so that the first equality 
makes sense. (The reader familiar with partial differential equations will recognize this as 
an often used trick.) Evaluating $\partial_t\partial_r\Phi = \partial_r\partial_t\Phi$, we find $\partial_t(\zeta V) = \partial_r(\zeta U)$. Given U(t, r), 
V(t, r), and the initial value $\zeta(0, r)$, we can determine $\zeta(t, r)$ by integrating this equation in 
t. In any case, we don’t care about all the details, merely that in principle there exists a $\tilde t$ with 
the stated property $d\tilde t = \zeta(t, r)(U dt + V dr)$. Eliminating dt in $U(t, r)dt^2 + 2V(t, r)dt\,dr$, 
we find that this expression becomes $(\zeta^2 U)^{-1}d\tilde t^2 - Y(\tilde t, r)dr^2$, where $Y(\tilde t, r)$ is some 
function we don’t need to determine. All we care about is good riddance to the nasty 
cross term. 
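
For readers who would like to see the cross term disappear explicitly, the elimination takes two lines. This is only a sketch: it writes $\zeta$ for the integrating factor and $\tilde t$ for the new time coordinate, and the identification $Y = V^2/U$ is simply what the algebra below gives, not something the argument depends on.

```latex
% Solve d\tilde t = \zeta (U\,dt + V\,dr) for dt:
dt = \frac{1}{U}\left(\frac{d\tilde t}{\zeta} - V\,dr\right)

% Factor the two terms, then substitute (note U\,dt + 2V\,dr = d\tilde t/\zeta + V\,dr):
U\,dt^2 + 2V\,dt\,dr = dt\,\left(U\,dt + 2V\,dr\right)
  = \frac{1}{U}\left(\frac{d\tilde t}{\zeta} - V\,dr\right)\left(\frac{d\tilde t}{\zeta} + V\,dr\right)
  = \frac{d\tilde t^{\,2}}{\zeta^2 U} - \frac{V^2}{U}\,dr^2
```

The product of a difference and a sum is what removes the term linear in $d\tilde t\,dr$.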


* In theoretical physics, we also have the smart man and the dumb man. It may be swell to be a smart rich 
man, but I would venture that being a dumb rich man may be worse than being a dumb poor man. 
Putting it all together, dropping twiddles, and renaming functions, we finally obtain 

$ds^2 = -A(t, r)dt^2 + B(t, r)dr^2 + r^2 d\Omega^2$ (5) 


To summarize, we have used the spherical symmetry and exploited our freedom to 
change coordinates to reduce $g_{\mu\nu}(x)$, potentially 10 functions each of 4 variables, to A(t, r) 
and B(t, r), 2 functions each of 2 variables. This enormous simplification is typical of many 
problems in general relativity. 

The geometrical meaning of this metric is easy to grasp without any fancy math. Space 
is “foliated” by spheres $S^2$, each with area $4\pi r^2$, and with the gap between “successive” 
spheres, that is, the distance between $(r, \theta, \varphi)$ and $(r + dr, \theta, \varphi)$, given by $\sqrt{B(t, r)}\,dr$. When 
coordinate time changes by dt, the elapsed proper time felt by different observers fixed at 
different values of r is given by $\sqrt{A(t, r)}\,dt$. 


Motion in a static isotropic spacetime 


Let us restrict the mass distribution further to be static, so that A(r) and B(r) do not depend 
on time. Thus, the most general static and isotropic metric 


$ds^2 = -A(r)dt^2 + B(r)dr^2 + r^2 d\Omega^2$ (6) 


can be written in terms of two functions² A(r) and B(r) of a single variable r. At this point, 
we do not know anything about these two functions, except that, far away from the mass, 
$A(r) \to 1 - \frac{2GM}{r}$ as $r \to \infty$, so that we recover the Newtonian gravitational potential 
$\Phi(r) = -\frac{GM}{r}$. To determine A(r) and B(r), we would need to master the dynamics of 
the gravitational field, and we won't get to that until part VI. But meanwhile, we can start 
studying the motion of a particle, such as a planet, in this curved spacetime by varying the 
Lagrangian 


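$L = \left[A(r)\left(\frac{dt}{d\tau}\right)^2 - B(r)\left(\frac{dr}{d\tau}\right)^2 - r^2\left(\frac{d\theta}{d\tau}\right)^2 - r^2\sin^2\theta\left(\frac{d\varphi}{d\tau}\right)^2\right]^{1/2}$ (7) 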


By now, we can practically vary in our heads and immediately write down the 4 equations: 


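$\frac{d}{d\tau}\left(A(r)\frac{dt}{d\tau}\right) = 0$ (8) 

$\frac{d}{d\tau}\left(B(r)\frac{dr}{d\tau}\right) + \frac{1}{2}A'(r)\left(\frac{dt}{d\tau}\right)^2 - \frac{1}{2}B'(r)\left(\frac{dr}{d\tau}\right)^2 - r\left(\frac{d\theta}{d\tau}\right)^2 - r\sin^2\theta\left(\frac{d\varphi}{d\tau}\right)^2 = 0$ (9) 

$\frac{d}{d\tau}\left(r^2\frac{d\theta}{d\tau}\right) - r^2\sin\theta\cos\theta\left(\frac{d\varphi}{d\tau}\right)^2 = 0$ (10) 

$\frac{d}{d\tau}\left(r^2\sin^2\theta\frac{d\varphi}{d\tau}\right) = 0$ (11) 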

These are of course precisely the equations contained in (1). For example, (10) is just 

$\frac{d}{d\tau}\left(g_{\theta\theta}\frac{d\theta}{d\tau}\right) = \frac{1}{2}\left(\partial_\theta g_{\varphi\varphi}\right)\left(\frac{d\varphi}{d\tau}\right)^2$. 

It is important to remember, from our discussion in chapter II.2, that we are entitled 
to trade any one of the 4 equations (8-11), which all involve second derivatives, for the 
defining equation for $d\tau$, 


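$A(r)\left(\frac{dt}{d\tau}\right)^2 - B(r)\left(\frac{dr}{d\tau}\right)^2 - r^2\left(\frac{d\theta}{d\tau}\right)^2 - r^2\sin^2\theta\left(\frac{d\varphi}{d\tau}\right)^2 = 1$ (12) 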


which only involves first derivatives. Of course, if we had any sense, we would trade (9) for 
(12). With this trade, these equations are not that difficult to solve. 

That math professor I referred to in the preceding chapter would dismiss most theorems 
as being so obvious that they are “self-proving.” In the same sense, (8) and (11) are 
self-solving, yielding 

$\frac{dt}{d\tau} = \frac{\epsilon}{A(r)}$ (13) 

and 

$\frac{d\varphi}{d\tau} = \frac{l}{r^2\sin^2\theta}$ (14) 

with $\epsilon$ and l two integration constants. (Do these two equations say anything to you?) 
Furthermore, we can solve (10) by setting $\theta(\tau) = \frac{\pi}{2}$. This is of course another consequence 
of the rotational symmetry of the problem: the planet stays in the equatorial plane. 
Plugging all this into (12), we obtain 

$\frac{\epsilon^2}{A(r)} - B(r)\left(\frac{dr}{d\tau}\right)^2 - \frac{l^2}{r^2} = 1$ (15) 


Cleaning up and rearranging a bit, we find that, remarkably enough, we can cast this 
equation for an Einsteinian particle moving in curved spacetime in the form of an equation 
for a Newtonian particle (of unit mass) moving in a potential v(r) with zero total energy 
(with $\tau$ playing the role of time): 

$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + v(r) = 0$ (16) 

with 

$v(r) = \frac{1}{2B(r)}\left(1 + \frac{l^2}{r^2}\right) - \frac{\epsilon^2}{2A(r)B(r)}$ (17) 


Once we are given A(r) and B(r), we merely have to solve a Newtonian problem in an 
unfamiliar potential! 
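
The step from (15) to (16) and (17) is short enough to check by machine. The sketch below uses SymPy, keeps A(r) and B(r) as unspecified functions, and introduces the hypothetical symbol `rdot2` to stand for $(dr/d\tau)^2$; none of these names come from the text.

```python
import sympy as sp

r, eps, l = sp.symbols("r epsilon l", positive=True)
A = sp.Function("A")(r)          # unspecified metric function A(r)
B = sp.Function("B")(r)          # unspecified metric function B(r)
rdot2 = sp.Symbol("rdot2")       # placeholder for (dr/dtau)^2

# Equation (15): eps^2/A - B (dr/dtau)^2 - l^2/r^2 = 1, solved for (dr/dtau)^2.
eq15 = sp.Eq(eps**2 / A - B * rdot2 - l**2 / r**2, 1)
rdot2_solved = sp.solve(eq15, rdot2)[0]

# Equation (17): the effective Newtonian potential.
v = (1 + l**2 / r**2) / (2 * B) - eps**2 / (2 * A * B)

# Equation (16) claims (1/2)(dr/dtau)^2 + v(r) = 0, so this residual should vanish.
residual = sp.simplify(rdot2_solved / 2 + v)
print(residual)
```

Because A and B stay symbolic, the check holds for any static isotropic metric of the form (6), not just for one particular solution.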


How light moves 


Earlier in this chapter, we recovered Newtonian motion for a slowly moving particle. Now 
let us treat the opposite limit and ask how light, or an ultrarelativistic particle, would move 
in curved spacetime. 
That’s easy to do. Back in chapter III.5, we unified the action principle for material 
particles and the action principle for light, which I regard as one of the great achievements 
of special relativity. No more Lagrange on the one hand, and Fermat on the other! 

The equivalence principle now makes our life sweet: we merely have to take what we 
did in chapter III.5 and promote the Minkowski metric $\eta_{\mu\nu}$ to $g_{\mu\nu}$. 


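So write the action $S = -m\int\sqrt{-g_{\mu\nu}(X)dX^\mu dX^\nu}$ (look at the manifest parametrization 
invariance, since no parameter appears) for a particle of mass m in the form 

$S = -\frac{1}{2}\int d\zeta\left(\sigma(\zeta)\left(\frac{dX}{d\zeta}\right)^2 + \frac{m^2}{\sigma(\zeta)}\right)$ 

where $\left(\frac{dX}{d\zeta}\right)^2 \equiv -g_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}$, so that we can take the massless 
limit to obtain 

$S_{\rm massless} = \frac{1}{2}\int d\zeta\,\sigma(\zeta)\,g_{\mu\nu}(X(\zeta))\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}$ (18) 

Varying with respect to $\sigma(\zeta)$ tells us that for a massless particle, 

$g_{\mu\nu}dX^\mu dX^\nu = 0$ (19) 

Varying with respect to $X^\lambda$, we obtain (just as back in (II.2.16)) 

$\frac{d}{d\zeta}\left(\sigma g_{\lambda\nu}\frac{dX^\nu}{d\zeta}\right) - \frac{1}{2}\sigma\left(\partial_\lambda g_{\mu\nu}\right)\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta} = 0$ 

As in appendix 1 in chapter III.5, it is clearly advantageous to define an affine parameter 
by $d\zeta = \sigma(\zeta)d\zeta'$. Dropping the prime, we obtain 

$\frac{d^2X^\lambda}{d\zeta^2} + \Gamma^\lambda_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta} = 0$, (20) 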


which looks superficially the same as (1). The difference is in the parameter choice. As 
usual, we trade the most complicated equation among (20) for (19), as has been explained 
ad nauseam starting in chapter II.2. 

It is perhaps good to summarize this business about natural parametrization. For curves 
in space, the length along the curve provides the natural parameter. For the worldline of 
a particle in spacetime, the proper time, namely the elapsed time in the rest frame of the 
particle, is the obvious candidate. For a massless particle such as the photon, no natural 
candidate presents itself, and we choose whichever parameter will make life easier. 

So, for light moving in the spacetime described by the metric (6), we have the same 
equations as (8-11) but with the proper time $\tau$ replaced by the “affine parameter” $\zeta$. For 
a photon moving in the equatorial plane (so that (10) is solved by $\theta = \frac{\pi}{2}$), we have, once 
again, $\frac{dt}{d\zeta} = \frac{\epsilon}{A(r)}$ and $\frac{d\varphi}{d\zeta} = \frac{l}{r^2}$, with $\epsilon$ and l two integration constants. Inserting this into 
(19), we obtain $\frac{\epsilon^2}{A(r)} - B(r)\left(\frac{dr}{d\zeta}\right)^2 - \frac{l^2}{r^2} = 0$ (which, of course, is just (15) with the right hand 
side set to 0 and with $\tau \to \zeta$). After rearranging, we obtain 

$\frac{1}{2}\left(\frac{dr}{d\zeta}\right)^2 + \frac{1}{2B(r)}\left(\frac{l^2}{r^2} - \frac{\epsilon^2}{A(r)}\right) = 0$ (21) 

Once again, this looks like a Newtonian problem in an unfamiliar potential. But we still 
have the freedom of scaling the affine parameter $\zeta \to \zeta/l$, and thus we learn that the 
physics does not depend on $\epsilon$ and l separately, but only on $b^2 = \frac{l^2}{\epsilon^2}$. 
Figure 2 The impact parameter b in a scattering process. (Diagram label: incoming photon.) 


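A better way of saying this is to eliminate the affine parameter $\zeta$ by dividing $\frac{dr}{d\zeta}$ by $\frac{d\varphi}{d\zeta} = \frac{l}{r^2}$, 
so that 

$\left(\frac{dr}{d\varphi}\right)^2 = \frac{r^4}{B(r)}\left(\frac{1}{A(r)b^2} - \frac{1}{r^2}\right)$ (22) 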


To identify this mysterious quantity b, we let $r \to \infty$, where space is nice and flat so high 
school geometry applies. Then (22) becomes $\left(\frac{dr}{d\varphi}\right)^2 \simeq \frac{r^4}{b^2}$, which has the solution $r\varphi \simeq b$. 
Thus, we see that b is what in a scattering process is called the impact parameter. See 
figure 2. 
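
One way to make the identification of b concrete is to switch gravity off altogether. The SymPy sketch below assumes exactly flat space, A = B = 1, and checks that a straight line passing at distance b from the origin, $r = b/\sin\varphi$, satisfies (22). The straight-line parametrization is an assumption introduced for this check and does not appear in the text.

```python
import sympy as sp

phi, b = sp.symbols("phi b", positive=True)

# A straight line whose closest approach to the origin is b.
r = b / sp.sin(phi)

lhs = sp.diff(r, phi) ** 2                 # (dr/dphi)^2
rhs = r**4 * (1 / b**2 - 1 / r**2)         # right side of (22) with A = B = 1

# If the straight line solves (22) in flat space, this difference vanishes.
print(sp.simplify(lhs - rhs))
```

For small $\varphi$ the line reduces to $r\varphi \simeq b$, the asymptotic form quoted above.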

The fact that in a static isotropic spacetime, the motion of material particles and of light 
both reduce to a Newtonian problem can be traced back to the metric $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$ 
being a quadratic form, and so in a sense, ultimately to Pythagoras. 


Parametrized post-Newtonian approximation 


We still have some distance to go before we learn how to determine A(r) and B(r) (in 
chapter VI.3), but meanwhile, dimensional analysis can take us quite far. You have probably 
heard of the celebrated solar system tests of Einstein gravity (which we will come to in 
chapter VI.3), such as the precession of the planet Mercury and the bending of starlight as 
it passes by the sun. Since G and the mass of the sun M always occur in the combination 
GM, as was already mentioned in part 0, A(r) and B(r) can only be functions of $\frac{GM}{r}$. The 
gravitational field in the solar system is so weak ($\frac{GM}{c^2 r} \ll 1$) that it is entirely adequate for 
these classic tests to expand and keep only the leading terms in the so-called parametrized 
post-Newtonian (PPN to those who love acronyms) approximation: 

$A(r) = 1 - \frac{2GM}{c^2 r} + 2(\beta - \gamma)\left(\frac{GM}{c^2 r}\right)^2 + \cdots$ 

$B(r) = 1 + 2\gamma\left(\frac{GM}{c^2 r}\right) + \cdots$ (23) 


I have purposely restored c, so that you can see that the expansion parameter is the ratio 
of the Newtonian gravitational potential energy of a unit mass test particle to its rest mass. 
It is also illuminating to take the large c limit: 

$ds^2 = -A(r)c^2dt^2 + B(r)dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\varphi^2$ 
$\to -\left(c^2 - \frac{2GM}{r}\right)dt^2 + dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\varphi^2 + O\left(\frac{1}{c^2}\right)$ (24) 


Recall chapter IV.1. 
As you will see in chapter VI.3, Einstein gravity gives $\beta = 1$ and $\gamma = 1$. Over the years 
since 1915 there have been competing theories of gravity giving other values, but they have 
generally fallen by the wayside. Current observations bound the deviation of $\beta$ and $\gamma$ from 
1 by two parts in $10^4$ and in $10^5$, respectively. Einstein gravity is in excellent agreement³ 
with observation, at least in this post-Newtonian approximation. 
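
To get a feeling for how small the expansion parameter in (23) actually is, one can simply put in numbers. The sketch below is illustrative only: the constants are standard reference values supplied here rather than taken from the text, and the function names are made up for this example.

```python
# Size of the post-Newtonian expansion parameter GM/(c^2 r) in the solar system.
GM_SUN = 1.32712440018e20   # m^3/s^2, gravitational parameter of the sun
C = 299_792_458.0           # m/s, speed of light
R_SUN = 6.957e8             # m, nominal solar radius
A_MERCURY = 5.79e10         # m, approximate semimajor axis of Mercury's orbit


def expansion_parameter(r):
    """Return GM/(c^2 r), the small quantity in which (23) is expanded."""
    return GM_SUN / (C**2 * r)


def ppn_metric(r, beta=1.0, gamma=1.0):
    """Return (A, B) from (23), truncated at the orders displayed there."""
    x = expansion_parameter(r)
    A = 1.0 - 2.0 * x + 2.0 * (beta - gamma) * x**2
    B = 1.0 + 2.0 * gamma * x
    return A, B


for name, r in [("edge of the sun", R_SUN), ("orbit of Mercury", A_MERCURY)]:
    print(name, expansion_parameter(r), ppn_metric(r))
```

Even for light grazing the sun the parameter is only a few parts in a million, which is why keeping the leading terms is adequate for the classic tests.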


Conservation laws and Killing vectors 


You may have recognized (13) and (14) as the general relativistic versions of energy 
conservation and angular momentum conservation, respectively. Going back, you see that 
these two conservation laws follow from the fact that the metric $g_{\mu\nu}$ in (6) does not depend 
on t and on $\varphi$, respectively. 

In general, if the metric $g_{\mu\nu}$ does not depend on a particular coordinate $x^\lambda$, then 
the geodesic equation for that coordinate, obtained as always (of course) by varying 
$\int(-g_{\mu\nu}dx^\mu dx^\nu)^{1/2}$ with respect to $x^\lambda$, simplifies immediately to 

$\frac{d}{d\tau}\left(g_{\lambda\nu}\frac{dx^\nu}{d\tau}\right) = 0$ (25) 

(In our example, (25) corresponds to (8) and (11).) In other words, $g_{\lambda\nu}\frac{dx^\nu}{d\tau}$ does not change 
as the particle moves along the geodesic. (For a massless particle, we merely replace the 
proper time by an affine parameter.) 

Since we are just applying the action principle, Noether’s theorem (as discussed in 
chapter II.4) directly implies these conservation laws. The action here, $\int(-g_{\mu\nu}dx^\mu dx^\nu)^{1/2}$, 
does not change upon shifting $x^\lambda$ by a constant. 

This discussion can be rendered more formal as follows. Let the metric be invariant 
upon $x^\mu \to x^\mu + \epsilon\xi^\mu$, with $\epsilon$ some infinitesimal. Then $\xi\cdot\frac{dx}{d\tau} \equiv g_{\mu\nu}\xi^\mu\frac{dx^\nu}{d\tau}$ is conserved. 
In general, there may be several such $\xi$s, known as Killing vectors. In our example, 
$\xi_t^\mu = (1, 0, 0, 0)$ and $\xi_\varphi^\mu = (0, 0, 0, 1)$. 
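
The statement that the metric is invariant under a shift along (1, 0, 0, 0) or (0, 0, 0, 1) can be checked symbolically. The sketch below relies on the standard expression for the first-order change of $g_{\mu\nu}$ under $x^\mu \to x^\mu + \epsilon\xi^\mu$ (the Lie derivative), which this section does not derive, so treat that formula as an assumption of the example. The function name `metric_change` is hypothetical.

```python
import sympy as sp

t, r, th, ph = sp.symbols("t r theta phi")
x = [t, r, th, ph]
A = sp.Function("A")(r)
B = sp.Function("B")(r)
g = sp.diag(-A, B, r**2, r**2 * sp.sin(th) ** 2)   # the metric (6)


def metric_change(xi):
    """First-order change of g_{mu nu} under x -> x + eps * xi.

    Assumed formula: xi^k d_k g_{mn} + g_{kn} d_m xi^k + g_{mk} d_n xi^k.
    """
    out = sp.zeros(4, 4)
    for m in range(4):
        for n in range(4):
            out[m, n] = sum(
                xi[k] * sp.diff(g[m, n], x[k])
                + g[k, n] * sp.diff(xi[k], x[m])
                + g[m, k] * sp.diff(xi[k], x[n])
                for k in range(4)
            )
    return out.applyfunc(sp.simplify)


xi_t = sp.Matrix([1, 0, 0, 0])     # shift in t
xi_phi = sp.Matrix([0, 0, 0, 1])   # shift in phi
xi_r = sp.Matrix([0, 1, 0, 0])     # shift in r: not a symmetry, since g depends on r

print(metric_change(xi_t))
print(metric_change(xi_phi))
print(metric_change(xi_r))
```

The first two results vanish identically, while the radial shift leaves a nonzero remainder built from A', B', and r, so no conserved quantity is associated with it.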

I rather dislike such apparently useless formal manipulations, but later in chapter IX.6, 
we will see that Killing vectors describe isometries of spacetime, an important and useful 
concept. For now, however, simply think of writing $\xi\cdot\frac{dx}{d\tau}$ as a shorthand for the more 
descriptive $g_{\mu\nu}\xi^\mu\frac{dx^\nu}{d\tau}$. Another way of saying this is that the momentum of a particle as 
usually defined ($p^\mu = m\frac{dx^\mu}{d\tau}$) is not conserved, but its component $\xi\cdot p$ along a Killing 
vector is. With our sign convention, the energy of the particle is given by $E = -\xi_t\cdot p$ and 
the angular momentum by $L = \xi_\varphi\cdot p = mr^2\sin^2\theta\frac{d\varphi}{d\tau}$. 


Appendix: Christoffel symbols around a time independent 
spherically symmetric mass distribution 


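We can read off the Christoffel symbols from the geodesic equations. For instance, (8) works out to be 
$\frac{d^2t}{d\tau^2} + \frac{A'}{A}\frac{dr}{d\tau}\frac{dt}{d\tau} = 0$, from which we read off $\Gamma^t_{tr} = \frac{A'}{2A}$. We will now list all of them: 

$\Gamma^t_{tr} = \frac{A'}{2A}, \quad \Gamma^r_{tt} = \frac{A'}{2B}, \quad \Gamma^r_{rr} = \frac{B'}{2B}, \quad \Gamma^r_{\theta\theta} = -\frac{r}{B}, \quad \Gamma^r_{\varphi\varphi} = -\frac{r\sin^2\theta}{B}$ 
$\Gamma^\theta_{r\theta} = \frac{1}{r}, \quad \Gamma^\varphi_{r\varphi} = \frac{1}{r}$ 
$\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta, \quad \Gamma^\varphi_{\theta\varphi} = \cot\theta$ (26) 

with all other components not related to these by symmetry vanishing. 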

We can understand a lot by symmetry considerations. For example, because the metric is invariant under 
$t \to -t$, Christoffel symbols with an odd number of t indices must vanish. Also, $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$, $\Gamma^\varphi_{\theta\varphi} = \cot\theta$ 
are the same as those for $S^2$ found way back in chapter II.2 and merely reflect the spherical symmetry of the 
metric. 

See also exercise 3. 
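
The list (26) can also be generated directly from the metric (6) rather than read off the geodesic equations. The sketch below uses SymPy and the textbook formula for the Christoffel symbols in terms of derivatives of the metric; the helper name `christoffel` and the index constants are introduced here for illustration.

```python
import sympy as sp

t, r, th, ph = sp.symbols("t r theta phi")
x = [t, r, th, ph]
A = sp.Function("A")(r)
B = sp.Function("B")(r)
g = sp.diag(-A, B, r**2, r**2 * sp.sin(th) ** 2)   # the metric (6)
ginv = g.inv()

T, R, TH, PH = range(4)   # readable names for the coordinate indices


def christoffel(l, m, n):
    """Gamma^l_{mn} = (1/2) g^{lk} (d_m g_{kn} + d_n g_{km} - d_k g_{mn})."""
    total = sum(
        ginv[l, k]
        * (sp.diff(g[k, n], x[m]) + sp.diff(g[k, m], x[n]) - sp.diff(g[m, n], x[k]))
        for k in range(4)
    )
    return sp.simplify(total / 2)


# A few entries to compare against (26).
print(christoffel(T, T, R))      # compare with A'/(2A)
print(christoffel(R, T, T))      # compare with A'/(2B)
print(christoffel(R, TH, TH))    # compare with -r/B
print(christoffel(PH, TH, PH))   # compare with cot(theta)
print(christoffel(T, T, T))      # odd number of t indices, should vanish
```

Making A and B functions of both t and r in the same script is one way to approach exercise 3.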


Exercises 


1 Work out the potential v(r) in the parametrized post-Newtonian approximation, assuming that $\beta$ and $\gamma$ are 
of order unity, and sketch its general form. 


2 Calculate $\Gamma^\mu_{\mu\nu}$ for the Christoffel symbols in (26) and verify an identity derived in chapter II.2. 


3 Find the Christoffel symbols for the time dependent spherically symmetric spacetime in (5). Show that we 
simply have to add⁴ 

$\Gamma^t_{tt} = \frac{\dot A}{2A}, \quad \Gamma^t_{rr} = \frac{\dot B}{2A}, \quad \Gamma^r_{tr} = \frac{\dot B}{2B}$ (27) 

(with the dot indicating $\frac{\partial}{\partial t}$) to the list in (26). Note that this is consistent with the transformation $t \to -t$. 


4 Show that in the spacetime $ds^2 = -dt^2 + \frac{1}{1 - \frac{2GM}{r}}dr^2 + r^2d\Omega^2$, particles with constant r, $\theta$, $\varphi$ are actually freely 
falling. 


5 Show that in the spacetime $ds^2 = -\frac{1}{1 - \frac{2GM}{r}}dt^2 + \frac{1}{1 - \frac{2GM}{r}}dr^2 + r^2d\Omega^2$, freely falling particles with constant $\theta$, $\varphi$ 
starting at r > 2GM actually fall toward larger r. 


6 Show that the geodesic followed by a massless particle is also the geodesic followed by a massless particle in 
a conformally equivalent spacetime. 


Notes 


1. Beware of some textbooks on this point! 

2. A historical note: until the 1950s, the notation goy = —e””) and g,, =e” (or this form with some other 
“suitable” letters) was used. The “nonexponential” notation used here appeared later. 

3. Periodically, there are reports of deviation from our understanding of gravity, both Newtonian and 
Einsteinian. Most of these eventually either go away with better measurements or are found to be due to 
“mundane” causes. (No doubt some 19th century physicists could say the same about Mercury’s perihelion 
precession (see chapter VI.3) before Einstein came along.) One interesting anomaly is the so-called Pioneer 
anomaly regarding the observed accelerations of the Pioneer 10 and Pioneer 11 spacecraft after they passed 
out of the solar system. As this book was being completed, this anomaly had been determined to be due to 
mundane causes. 


4. In S. Weinberg, Gravitation and Cosmology (1972 edition), I’. is missing a time derivative on p. 336. 


V.5 Tensors in General Relativity 


The mother of all vectors 


I have insisted again and again that the laws of physics should be expressed in terms of 
vectors and tensors, so that what different observers see can be simply related. I will now 
talk about vectors and tensors for the third time. 

My pedagogical philosophy in this book can be expressed as “one baby step at a time.” 
We started with a discussion of vectors and tensors under rotations in chapter I.3. Then we 
discussed vectors and tensors in special relativity, and encountered the novelty of having 
to keep track of upper and lower indices. The coordinates transform as $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, or 
better, the coordinate differentials transform as $dx'^\mu = \Lambda^\mu{}_\nu dx^\nu$. In this case, the insistence 
on talking about differentials hardly matters, since $\Lambda^\mu{}_\nu$ is a constant matrix and so the 
differential version of the transformation follows trivially upon differentiating: $dx^\mu$ and 
$x^\mu$ transform in the same way. 

In both these cases, rotation and Lorentz transformation, vectors are defined as objects 
that transform like $dx^\mu$: we could say that $dx^\mu$ is the “ur-vector” or “the mother of all 
vectors.” Tensors are then defined as objects that transform as if they were built out of 
vectors. 

Now that we have mastered these two baby steps, we are ready to leap to vectors and 
tensors in general relativity! 

Under a general coordinate transformation $x \to x'(x)$, the coordinate differentials 
transform as 

$dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}dx^\nu \equiv S^\mu{}_\nu(x)dx^\nu$ (1) 

To save writing, we have defined the transformation matrix $S^\mu{}_\nu(x)$, which plays the same 
role as $\Lambda^\mu{}_\nu$ in a Lorentz transformation, but with one huge difference: unlike $\Lambda$, S depends 
on x. The transformation law changes from place to place. 
Another important point is that our insistence on coordinate differentials, which seemed 
so academic and nit-picking, has finally become de rigueur: by definition $dx^\mu$, but not $x^\mu$, 
transforms like a vector. In general, the transformation $x^\mu \to x'^\mu$ is arbitrary and nonlinear. 

You must master this point: $dx^\mu$ transforms linearly, but not $x^\mu$. There is all the difference 
in the world between these two different mathematical creatures. 

Indeed, I have prepared you for these two steps also in chapters I.5 and I.6, discussing 
coordinate transformation and curved spaces. The formalism here is exactly the same, 
except the signature is that of spacetime rather than that of space. You have already 
encountered essentially everything we will discuss presently. As I anticipated in chapter I.5, 
the concepts needed for curved spacetimes have their seeds in the conceptually simple 
transformation from Cartesian coordinates to spherical coordinates that every physicist has 
dealt with since childhood. The point we just emphasized can already be seen in making a 
coordinate transformation: while x, y, z and r, $\theta$, $\varphi$ are related to each other nonlinearly, 
dx, dy, dz and dr, $d\theta$, $d\varphi$ are related to each other linearly. Granted, the matrix that 
relates dx, dy, dz and dr, $d\theta$, $d\varphi$ may have elements that involve highly nontrivial 
functions (such as trigonometric and inverse trigonometric functions), but the important 
property is the linearity. To repeat, although the functions $S^\mu{}_\nu(x)$ may be algebraically 
quite complicated, the relation $dx'^\mu = S^\mu{}_\nu(x)dx^\nu$ is nevertheless linear and hence easy to 
manipulate. 
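
The Cartesian and spherical example can be spelled out in a few lines. In the sketch below (SymPy, with variable names chosen here for illustration), the coordinates themselves are related by trigonometric functions, while the differentials are related by a matrix multiplication.

```python
import sympy as sp

r, th, ph = sp.symbols("r theta phi", positive=True)

# Cartesian coordinates as nonlinear functions of the spherical ones.
cartesian = sp.Matrix([
    r * sp.sin(th) * sp.cos(ph),
    r * sp.sin(th) * sp.sin(ph),
    r * sp.cos(th),
])

# Matrix of partial derivatives d(x, y, z)/d(r, theta, phi).
J = cartesian.jacobian([r, th, ph])

# The differentials are related linearly: (dx, dy, dz) = J (dr, dtheta, dphi).
dr, dth, dph = sp.symbols("dr dtheta dphi")
d_cartesian = J * sp.Matrix([dr, dth, dph])

print(J)
print(d_cartesian[2])   # dz expressed through dr and dtheta
```

The entries of `J` depend on where you are, which is exactly the x dependence of S, but at any given point the map from one set of differentials to the other is linear.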


Vectors and the construction of tensors 


A vector $W^\mu(x)$ (or more precisely a vector field: a vector that depends on x) is defined as 
something that transforms under a general coordinate transformation like the ur-vector 
$dx^\mu$ does in (1): 

$W'^\mu(x') = S^\mu{}_\nu(x)W^\nu(x)$ (2) 

Notice that $W'^\mu$ is evaluated at x', while $W^\nu$ is evaluated at x, but these are just different 
coordinate values describing the same point P. 

Remember the student in chapter I.4 who was puzzled by the statement that a tensor is 
something that transforms like a tensor. In fact, he should already have been puzzled by 
the statement that a vector is something that transforms like a vector, an example of which 
is the ur-vector $dx^\mu$. 

Just as before, we define a 2-indexed tensor $T^{\mu\nu}(x)$ as something that transforms like 

$T'^{\mu\nu}(x') = S^\mu{}_\rho(x)S^\nu{}_\sigma(x)T^{\rho\sigma}(x)$ (3) 

We can now go on to define tensors with as many indices as we want, transforming like 

$T'^{\mu\nu\cdots\rho}(x') = S^\mu{}_\kappa(x)S^\nu{}_\lambda(x)\cdots S^\rho{}_\sigma(x)T^{\kappa\lambda\cdots\sigma}(x)$ (4) 

As I warned you, I am literally saying some of these things for the third time. 
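Upper and lower indices 


The transformation of the metric is fixed by the invariance of the proper length, exactly as 
in chapter I.5, when we rather innocently first changed coordinates, 

$ds^2 = g'_{\rho\sigma}(x')dx'^\rho dx'^\sigma = g_{\mu\nu}(x)dx^\mu dx^\nu = g_{\mu\nu}(x)\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}dx'^\rho dx'^\sigma$ (5) 

so that 

$g'_{\rho\sigma}(x') = g_{\mu\nu}(x)(S^{-1})^\mu{}_\rho(x)(S^{-1})^\nu{}_\sigma(x)$ (6) 

with the definition $(S^{-1})^\mu{}_\rho = \frac{\partial x^\mu}{\partial x'^\rho}$ once again. For your enlightenment, I even used the 
same notation in chapter I.5. And, yes, yet once again, using the chain rule discovered by 
our forefathers who invented calculus, we verify 
$(S^{-1})^\mu{}_\rho S^\rho{}_\nu = \frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x'^\rho}{\partial x^\nu} = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu$, so that 
$S^{-1}$ is indeed the inverse of S. 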

A few points are worth emphasizing here. The need to maintain the summation 
convention of always summing an upper index with a lower index forces $g_{\mu\nu}$ to carry lower indices, 
since it is by definition yoked to $dx^\mu dx^\nu$. The transformation is then thrust upon us. 

Tensors with upper indices transform as in (4) with S, while here a tensor with lower 
indices, namely the metric itself, transforms “oppositely” with $S^{-1}$, namely like 
$g'_{\rho\sigma} = g_{\mu\nu}(S^{-1})^\mu{}_\rho(S^{-1})^\nu{}_\sigma$. No surprise here at all: the upper and lower indices are contracted in 
the invariant $ds^2 = g_{\mu\nu}(x)dx^\mu dx^\nu$. Exactly as in chapter III.3, we can define the transpose 
of $S^{-1}$ by $((S^{-1})^T)_\rho{}^\mu = (S^{-1})^\mu{}_\rho$. (Notice that, just as in chapter III.3, when we transpose we 
do not move anybody up and down stairs.) We can then write $g' = (S^{-1})^T g S^{-1}$, regarding 
the metric as a matrix. 
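
The matrix form of the transformation law can be tried out on the most familiar case. In the sketch below, the unprimed coordinates are taken to be Cartesian (so the metric is the unit matrix) and the primed ones spherical; this choice, and the variable names, are assumptions made for the example. With that choice, $S^{-1}$ is the matrix of derivatives of x, y, z with respect to r, $\theta$, $\varphi$.

```python
import sympy as sp

r, th, ph = sp.symbols("r theta phi", positive=True)

# Unprimed coordinates x^mu = (x, y, z), written as functions of x'^rho = (r, theta, phi).
cartesian = sp.Matrix([
    r * sp.sin(th) * sp.cos(ph),
    r * sp.sin(th) * sp.sin(ph),
    r * sp.cos(th),
])

S_inv = cartesian.jacobian([r, th, ph])   # (S^{-1})^mu_rho = dx^mu / dx'^rho
g = sp.eye(3)                             # Euclidean metric in Cartesian coordinates

# g' = (S^{-1})^T g S^{-1}
g_prime = (S_inv.T * g * S_inv).applyfunc(sp.simplify)
print(g_prime)
```

The result is the diagonal metric of flat space in spherical coordinates, with entries 1, $r^2$, and $r^2\sin^2\theta$, the same angular structure that appears in (5) and (6) of the preceding chapter.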

Thus far, I have carefully indicated that unprimed tensors depend on x and primed 
tensors depend on x’. (Of course, since x and x’ are related, we can always regard a function 
of x’ as a function of x and vice versa, but this way, in which things are usually written, 
is more natural.) We adopt the convention of thinking of S and S~! as functions of x. To 
avoid clutter, we will henceforth often suppress the x and x’ dependence of various objects 
if there is no risk of confusion. 

Once again, as in our earlier discussion of coordinate changes and curved spaces, we can 
use the metric $g_{\mu\nu}(x)$ to lower indices. Given a vector $W^\nu$, we invite ourselves to construct 
a vector with lower index 

$W_\mu = g_{\mu\nu}W^\nu$ (7) 

No prize for your correct guess on how $W_\mu$ transforms: 

$W'_\rho = g'_{\rho\sigma}W'^\sigma = g_{\mu\nu}(S^{-1})^\mu{}_\rho(S^{-1})^\nu{}_\sigma S^\sigma{}_\lambda W^\lambda = W_\mu(S^{-1})^\mu{}_\rho$ (8) 

Speaking colloquially, we can say that when a lower index and an upper index ($\sigma$ in this 
example) are summed over as in the Einstein convention, the associated transformation 
matrices $S^{-1}$ and S knock each other off. Compare (8) with (2). 
Given two vectors W and U, we can form, in analogy with ordinary Euclidean 3-vectors and 
Lorentz 4-vectors, the dot product $W_\mu U^\mu = g_{\mu\nu}W^\nu U^\mu$. In light of the preceding remark, 
we expect $W_\mu U^\mu$ to transform like a scalar, that is, it does not transform at all: indeed, 

$W'_\mu U'^\mu = W_\nu(S^{-1})^\nu{}_\mu S^\mu{}_\lambda U^\lambda = W_\nu U^\nu$ (9) 

As far as the transformation properties are concerned, the summed-over pair of indices 
effectively disappear. Indeed, we can define $\phi(x) = W_\mu(x)U^\mu(x)$, and it transforms like a 
scalar $\phi'(x') = \phi(x)$. 
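
The way S and $S^{-1}$ knock each other off in (9) is easy to watch numerically. In the sketch below, `S` is just a random invertible 4 by 4 matrix standing in for S(x) at one point; it is not derived from any particular coordinate change, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# A generic invertible matrix playing the role of S at a single point.
S = rng.normal(size=(4, 4)) + 4.0 * np.eye(4)
S_inv = np.linalg.inv(S)

W_lower = rng.normal(size=4)   # components W_mu
U_upper = rng.normal(size=4)   # components U^mu

# Upper index: U'^mu = S^mu_nu U^nu, as in (2).
U_upper_new = S @ U_upper
# Lower index: W'_rho = W_mu (S^{-1})^mu_rho, as in (8).
W_lower_new = W_lower @ S_inv

before = W_lower @ U_upper
after = W_lower_new @ U_upper_new
print(before, after)
```

The two printed numbers agree up to rounding, whatever matrix is used for `S`.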

Now that we know how a vector with a lower index transforms, we can define tensors 
with lower indices. For example, the tensor $T_{\mu\nu\lambda}$ transforms as if it were (even though it is 
not) built up of three vectors $W_\mu V_\nu U_\lambda$. Indeed, we can clearly define tensors with arbitrary 
numbers of upper and lower indices, transforming like 

$T'^{\mu\cdots\nu}{}_{\rho\cdots\sigma} = S^\mu{}_\kappa\cdots S^\nu{}_\lambda\,T^{\kappa\cdots\lambda}{}_{\omega\cdots\eta}\,(S^{-1})^\omega{}_\rho\cdots(S^{-1})^\eta{}_\sigma$ (10) 

You get the idea before I run out of Greek letters! As before, fancy people who like big 
words call the upper indices contravariant and the lower indices covariant. Upper and 
lower indices transform oppositely. 

Certainly, there are deep mathematical reasons underlying the appearances of upper 
and lower indices, but at a pedestrian level, just as in our discussion of Lorentz vectors 
and tensors, you can simply regard lower indices as a notational device to avoid writing 
$g_{\mu\nu}$ all the time. Also not surprisingly, since we use $g_{\mu\nu}$ to lower indices, we might expect 
to use its inverse to raise them. 

We define the inverse metric $g^{\rho\mu}$ by $g^{\rho\mu}g_{\mu\nu} = \delta^\rho_\nu$. In other words, define the inverse 
metric as the inverse of the metric regarded as a matrix. (As I remarked in connection with 
the Minkowski metric $\eta_{\mu\nu}$ and its inverse $\eta^{\mu\nu}$, the spacetime metric $g_{\mu\nu}$ and its inverse 
$g^{\mu\nu}$ are also denoted by the same letter g but distinguished by the position of their indices, 
a potential source of confusion for some seeing this for the first time. Things are just as in 
chapter III.3, where we used the Minkowski metric $\eta_{\mu\nu}$ and its inverse to lower and raise 
indices, respectively.) 

We can check that the inverse metric raises indices by contracting $g^{\rho\mu}$ with $W_\mu$: 
$g^{\rho\mu}W_\mu = g^{\rho\mu}g_{\mu\nu}W^\nu = \delta^\rho_\nu W^\nu = W^\rho$; indeed, we get $W^\rho$ back. Once again, as in our 
discussion of Lorentz tensors, we can lower and raise indices at will using $g_{\mu\nu}$ and $g^{\mu\nu}$, 
respectively. For example, $T_\mu{}^\rho = g_{\mu\nu}T^{\nu\rho}$. Incidentally, it is common practice to use the 
same letter T to denote entirely different tensors, distinguished by the number and kind 
of indices they carry. 

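Taking the inverse of $g' = (S^{-1})^T g S^{-1}$, we see that $g'^{-1} = S g^{-1} S^T$, or written out more 
explicitly, 

$g'^{\rho\sigma} = S^\rho{}_\mu g^{\mu\nu}(S^T)_\nu{}^\sigma = S^\rho{}_\mu S^\sigma{}_\nu g^{\mu\nu}$ (11) 

We can, if you wish, check the obvious, that the transformed $g'^{\rho\sigma}$ is indeed the inverse of 
the transformed $g'_{\sigma\kappa}$ in (6): 

$g'^{\rho\sigma}g'_{\sigma\kappa} = S^\rho{}_\mu S^\sigma{}_\nu g^{\mu\nu}g_{\lambda\omega}(S^{-1})^\lambda{}_\sigma(S^{-1})^\omega{}_\kappa = S^\rho{}_\mu g^{\mu\nu}g_{\nu\omega}(S^{-1})^\omega{}_\kappa = S^\rho{}_\mu\delta^\mu_\omega(S^{-1})^\omega{}_\kappa = \delta^\rho_\kappa$ 

By now, you should be familiar with how this works; for example, in the second equality, 
$S^\sigma{}_\nu$ and $(S^{-1})^\lambda{}_\sigma$ knock each other off. 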

To summarize, we can define tensors with however many upper and lower indices we 
like. Each upper index transforms with S, and each lower index with $S^{-1}$. For example, a 
tensor with one upper and three lower indices transforms like 

$W'^\lambda{}_{\mu\nu\rho} = S^\lambda{}_\sigma W^\sigma{}_{\omega\eta\kappa}(S^{-1})^\omega{}_\mu(S^{-1})^\eta{}_\nu(S^{-1})^\kappa{}_\rho$ (12) 

In the repeated index summation convention, we are allowed to contract an upper index 
with a lower index, namely to set them equal and sum over them. For example, setting $\lambda$ 
equal to $\mu$ in (12) and summing over them, we obtain 

$W'^\mu{}_{\mu\nu\rho} = S^\mu{}_\sigma W^\sigma{}_{\omega\eta\kappa}(S^{-1})^\omega{}_\mu(S^{-1})^\eta{}_\nu(S^{-1})^\kappa{}_\rho = \delta^\omega_\sigma W^\sigma{}_{\omega\eta\kappa}(S^{-1})^\eta{}_\nu(S^{-1})^\kappa{}_\rho = W^\sigma{}_{\sigma\eta\kappa}(S^{-1})^\eta{}_\nu(S^{-1})^\kappa{}_\rho$ 

In other words, as you might expect, $W_{\nu\rho} \equiv W^\mu{}_{\mu\nu\rho}$ transforms like a tensor with two lower 
indices: in (12), the S knocks off an $S^{-1}$ as explained earlier. 

You are never allowed to contract an upper index with an upper index, or a lower 
index with a lower index. The reason is obvious. Suppose you set $\mu$ and $\nu$ equal in (12) 
and sum. You would encounter, instead of an S and an $S^{-1}$ knocking each other off, 
$(S^{-1})^\omega{}_\mu(S^{-1})^\eta{}_\mu$, which is not anything in particular. If you want to set two upper indices 
(or two lower indices) equal and sum over them, the correct procedure is to multiply by the 
metric (or the inverse metric). For example, $W^\lambda{}_{\mu\nu\rho}g^{\mu\nu}$ is a legitimate tensor. We can regard 
the contraction with the metric (or the inverse metric) as a two-step process: we use the 
metric to lower (or the inverse metric to raise) one of the two indices, and then contract an 
upper index with a lower index. Thus, in our example, first define $W^\lambda{}_\mu{}^\nu{}_\rho = W^\lambda{}_{\mu\sigma\rho}g^{\sigma\nu}$ and 
then evaluate $W^\lambda{}_\mu{}^\mu{}_\rho$. 
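
The difference between a legitimate and an illegitimate contraction shows up immediately in a numerical experiment. The sketch below uses two-index objects for brevity, a random invertible matrix as a stand-in for S at one point, and a random symmetric array as a stand-in for the metric; all of these are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(4, 4)) + 4.0 * np.eye(4)   # stand-in for S(x) at one point
S_inv = np.linalg.inv(S)

g = rng.normal(size=(4, 4))
g = g + g.T                           # any symmetric array serves for this check
T_mixed = rng.normal(size=(4, 4))     # components T^mu_nu
T_upper = rng.normal(size=(4, 4))     # components T^{mu nu}

# Upper indices pick up S, lower indices pick up S^{-1}.
T_mixed_new = np.einsum("ma,ab,bn->mn", S, T_mixed, S_inv)
T_upper_new = np.einsum("ma,nb,ab->mn", S, S, T_upper)
g_new = np.einsum("am,ab,bn->mn", S_inv, g, S_inv)

# Upper with lower: invariant.
print(np.trace(T_mixed), np.trace(T_mixed_new))
# Upper with upper, no metric: not invariant.
print(np.trace(T_upper), np.trace(T_upper_new))
# Upper with upper via the metric: invariant again.
print(np.sum(g * T_upper), np.sum(g_new * T_upper_new))
```

Only the middle pair of numbers disagrees, which is the numerical face of the statement that the sum over two upper indices is not anything in particular.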

I keep belaboring the obvious, but as I said earlier in chapter I.5, I want to make 
sure that the rest of the book will not pose any difficulty for you. We can of course also 
contract an upper index from one tensor (for example $A^{\mu\nu}$) with a lower index from another 
tensor (for example $B^\tau{}_{\lambda\omega\rho\eta}$). This is a trivial statement, as we can always define the tensor 
$T^{\mu\nu\tau}{}_{\lambda\omega\rho\eta} \equiv A^{\mu\nu}B^\tau{}_{\lambda\omega\rho\eta}$ and then contract any upper index with any lower index on T (for 
example, $T^{\mu\nu\tau}{}_{\lambda\nu\rho\eta}$, producing the tensor $A^{\mu\nu}B^\tau{}_{\lambda\nu\rho\eta}$). 

It might be worth emphasizing that I always write the coordinates $x^\mu$ with an upper 
index. Formally, if we run into the combination $g_{\mu\nu}dx^\nu$, we could call it $dx_\mu$ if we insist, 
but the symbol $x_\mu$ by itself is meaningless, or at least not useful. 


Quotient theorem 


I close by mentioning an obvious truth that is sometimes elevated to the status of a theorem. 
As I just said, if you multiply two tensors together and contract some upper indices with 
lower indices, the result is evidently a tensor. For example, if Q and W are tensors, then 
$P^{\rho\sigma}{}_\lambda = Q^{\rho\mu}W^\sigma{}_{\mu\lambda}$ is a tensor. The proof is straightforward: when we go from unprimed 
to primed coordinates, for each pair of contracted indices, S and $S^{-1}$ knock each other 
off, and the various S's and $S^{-1}$'s left dangling are precisely what are needed to make the 
left hand side a tensor. The quotient theorem states that given two tensors P and W and 
$P^{\rho\sigma}{}_\lambda = Q^{\rho\mu}W^\sigma{}_{\mu\lambda}$, we can conclude that Q is a tensor. I will leave it to you to prove this 
more than plausible assertion by simply writing down the transformed versions of this 
equation and “peeling off” S's and $S^{-1}$'s. Think of tensors as delicate contraptions “turned” 
by various factors of S's and $S^{-1}$'s upon a coordinate transformation. Unless Q also gets 
turned by the appropriate factors of S's and $S^{-1}$'s, there is no way that P will transform 
correctly. I will often use this theorem implicitly. 


Lorentz transformation, change of coordinates, curved space, 
and curved spacetime: Not quite the same deal 


I keep emphasizing that one unified formalism can be used to discuss rotation, Lorentz 
transformation, change of coordinates, curved space, and curved spacetime. But it is also 
important to understand the crucial differences among them. Let us summarize and 
contrast. 

A Lorentz transformation $\Lambda$ transforms the Minkowski metric into itself (as stated 
in (III.3.8)): 

$\eta_{\rho\sigma} = (\Lambda^T)_\rho{}^\mu\,\eta_{\mu\nu}\,\Lambda^\nu{}_\sigma$ (13) 

This requirement of invariance imposes a restriction on $\Lambda$ and defines the Lorentz group. 
Rotations may be treated as a special case of this. We simply mentally replace $\eta$ by the 
unit matrix and $\Lambda$ by R. The restriction on R defines the rotation group. 

When we transform coordinates, or study curved space, we have 

$g'_{\rho\sigma}(x') = ((S^{-1})^T)_\rho{}^\mu(x)\,g_{\mu\nu}(x)\,(S^{-1})^\nu{}_\sigma(x)$ (14) 

While (14) looks deceptively similar to (13) with $S^{-1}(x)$ playing the role of $\Lambda$, it conveys 
quite a different message. This equation is not an invariance requirement like (13), but 
rather informs us about how the metric $g_{\mu\nu}$ transforms under a coordinate transformation 
described by S(x): it tells us what $g'_{\rho\sigma}(x')$ is. However, there is also a subtext about 
invariance, namely a statement of how two metrics g' and g should be related in order 
to describe the same geometric entity. A key difference is that on the left hand side of 
(13), primed quantities are nowhere to be found. In contrast to (14), (13) informs us 
that a very special metric, written down by a certain Mr. Minkowski, and before him a 
certain Mr. Pythagoras, is left unchanged by a group of linear transformations, Lorentz 
transformations in one case, and rotations in the other. 


The equivalence principle is not a statement about symmetry 


Next, as we proceed from curved space to curved spacetime, we need Mr. Einstein’s 
additional insight of linking the metric to the gravitational field. Formally, we have the same 
equation (14), but now it informs us that the physics of one observer under the influence 
of a gravitational field is related to the physics of another observer under the influence of 
some other gravitational field. In particular, according to the equivalence principle, one 
observer could even feel no gravitational field; that is, $g_{\mu\nu}$ may happen to be equal to $\eta_{\mu\nu}$. 
In this sense, the equivalence principle is reminiscent of the statement that we can always 
transform to locally flat coordinates. 

In short, a coordinate transformation in general relativity relates the physics seen by 
two different observers, even if one of them reports seeing a gravitational field and the 
other not. If you wish, you could say that there is no such thing as a gravitational field, 
only curved spacetime, or vice versa, as discussed in chapter V.2. 


Differentiating scalars, vectors, and tensors 


Let $\phi(x)$ be a scalar under general coordinate transformation, so that $\phi'(x') = \phi(x)$. How 
does $\partial_\mu\phi(x)$ transform? Again, by now, you would have guessed that it transforms like a 
vector with a lower index, and indeed 

$\partial'_\mu\phi'(x') = \frac{\partial\phi'(x')}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\frac{\partial\phi(x)}{\partial x^\nu} = (S^{-1})^\nu{}_\mu\partial_\nu\phi(x)$ (15) 

Speaking loosely, we might say that in the definition $\partial_\mu \equiv \frac{\partial}{\partial x^\mu}$, since the upper index $\mu$ 
appears in the denominator, it acts effectively as a lower index. Another way of thinking 
about this is to regard $\partial_\mu$, the ur-vector with a lower index, as the “dual” of $dx^\mu$, the ur-vector 
with an upper index.* It is useful to remember that 

$\partial'_\mu = (S^{-1})^\nu{}_\mu\partial_\nu$ (16) 

We write $(S^{-1})^\nu{}_\mu$ to the left of $\partial_\nu$ to make clear that the derivative does not act on $S^{-1}$. 
An important question: given a vector $W^\nu(x)$, do you expect $\partial_\mu W^\nu(x)$ to transform as a 
tensor? Think about this for a moment before reading on. 


Some key things to know about tensors 


Let us collect together some key things you have learned about tensors in general relativity: 


1. The indices on a tensor transform independently, that is, as if the other indices were not 
there. 

2. Tensors in general relativity work pretty much the same way as the tensors you are familiar 
with in connection with the rotation group except for two important differences: 

a. The transformation matrix $S^\mu{}_\nu$ changes from place to place. 
b. There are two floors, and you have to use $g_{\mu\nu}$ and $g^{\mu\nu}$ to move indices upstairs and 
downstairs. 

3. Always contract an upper index with a lower index. It doesn’t make sense to contract an upper 
index with an upper index. You can multiply two upper indices by $g_{\mu\nu}$, but that amounts to 
moving one of them downstairs first, and then contracting. 

4. The coordinate $x^\mu$ is not a vector, but $dx^\mu$ is. 

5. The two ur-vectors $dx^\mu$ and $\partial_\mu$ transform oppositely. 

6. The ordinary derivative of a tensor is not a tensor. You must use the covariant derivative. 


* Some readers may recognize that I am sneaking in some notions of a more modern approach to vector and 
tensor analysis by speaking of $dx^\mu$ and $\partial_\mu$ as dual ur-vectors. 


Statement 6 answers the question I asked you in the previous section. The next chapter 
is devoted to explaining and elaborating this statement, in particular, to finding out what 
the covariant derivative is. 


Appendix: Index-free representation of vector fields 


Statement 5 suggests that $dx^\mu$ and $\partial_\mu$ could be used as basis vectors. We could exploit this observation to develop 
an index-free formalism along the following line. Given a vector field $A^\mu(x)$, we can define an index-free object 
$A(x) = A^\mu(x)\partial_\mu$. An abecedarian might take some time to get used to this notion of regarding vector fields as 
differential operators, but it leads us naturally to consider the commutator $C = [A, B]$. More explicitly, $C^\nu\partial_\nu = 
(A^\mu\partial_\mu)(B^\nu\partial_\nu) - (A \leftrightarrow B) = A^\mu(\partial_\mu B^\nu)\partial_\nu - (A \leftrightarrow B)$, since $\partial_\mu\partial_\nu = \partial_\nu\partial_\mu$. Thus, $C^\nu = A^\mu(\partial_\mu B^\nu) - B^\mu(\partial_\mu A^\nu)$. In 
other words, we differentiate the vector field $B$ in the direction of the vector field $A$, interchange $A$ and $B$, and 
then compare the two results. 

If you are reminded of the commutators in the Lie algebra introduced way back in appendix 2 to chapter I.3, 
you might have also suspected that a deep connection exists. Indeed, the notion of representing generators by 
differential operators instead of matrices also appeared there. For example, we had $-iJ_z = \left(y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}\right)$ for the 
generator of rotations about the z-axis. In the notation used in the preceding paragraph, what we did in chapter 
I.3 amounts to writing $-iJ_z = A^\mu\partial_\mu$ with the vector field $A^\mu$ defined by $(y, -x, 0)$. This provides another way of 
calculating the Lie algebra for the rotation group, for example, $[J_x, J_y] = iJ_z$. You can readily verify this relation, 
regarding $J_x$, $J_y$, and $J_z$ as differential operators and using elementary calculus. 
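
The verification suggested above can also be done numerically. The sketch below evaluates $C^\nu = A^\mu(\partial_\mu B^\nu) - B^\mu(\partial_\mu A^\nu)$ by central differences. The field `Az` is the one given in the text for $-iJ_z$; the fields `Ax` and `Ay` for $-iJ_x$ and $-iJ_y$ are an assumption, obtained by cyclic permutation of $x, y, z$. With these conventions, $[J_x, J_y] = iJ_z$ is equivalent to the statement that the commutator of `Ax` and `Ay` is `Az`.

```python
def commutator(A, B, p, h=1e-5):
    """C^nu = A^mu d_mu B^nu - B^mu d_mu A^nu at the point p (central differences)."""
    n = len(p)

    def d(F, mu):
        # Returns the list of d_mu F^nu for all nu.
        up, dn = list(p), list(p)
        up[mu] += h
        dn[mu] -= h
        Fu, Fd = F(up), F(dn)
        return [(Fu[nu] - Fd[nu]) / (2 * h) for nu in range(n)]

    a, b = A(p), B(p)
    dA = [d(A, mu) for mu in range(n)]  # dA[mu][nu] = d_mu A^nu
    dB = [d(B, mu) for mu in range(n)]
    return [sum(a[mu] * dB[mu][nu] - b[mu] * dA[mu][nu] for mu in range(n))
            for nu in range(n)]

def Ax(p):
    x, y, z = p
    return [0.0, z, -y]   # assumed field for -iJ_x (cyclic permutation of Az)

def Ay(p):
    x, y, z = p
    return [-z, 0.0, x]   # assumed field for -iJ_y (cyclic permutation of Az)

def Az(p):
    x, y, z = p
    return [y, -x, 0.0]   # field for -iJ_z, as given in the text
```

At any sample point `p`, `commutator(Ax, Ay, p)` should reproduce `Az(p)` up to rounding, since the fields are linear in the coordinates.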

Similarly, given a vector field $A_\mu(x)$ with a lower index, we can define an index-free object $A(x) = A_\mu(x)dx^\mu$. 
We will come back to this in chapter IX.3. 


V.6 Covariant Differentiation 


“How do you transform?” 


In our continued effort to honor the fundamental principle that physics does not depend 
on the physicist, we have to ask, sort of as in daily life, every new object or expression 
we encounter, “How do you transform?”! Physical laws are to be formulated in terms of 
objects that transform properly. 

In the preceding chapter, we verified that, given a scalar $\phi(x)$, its derivative or “gradient” 
$\partial_\mu\phi(x)$ transforms like a vector with a lower index. This also justifies the shorthand 
notation $\partial_\mu \equiv \frac{\partial}{\partial x^\mu}$. Taking the partial derivative of an object, we add a lower index to the 
object. Quite naturally, we would like to go from scalar to vector and beyond. 


At the end of the preceding chapter, I asked you a crucial question: given a vector $W^\mu(x)$, 
how does $\partial_\lambda W^\mu(x)$ transform? Naively (actually, very naively), you might guess, just by 
looking at the indices the object $\partial_\lambda W^\mu$ carries, that it transforms like a tensor $T_\lambda{}^\mu$ with 
one upper and one lower index. But you can see that can’t be true just by looking at the 
transformation law for $W^\mu(x)$: 

$$W'^\mu(x') = S^\mu{}_\nu(x)\,W^\nu(x) \qquad (1)$$

The object $\partial_\lambda W^\mu(x)$ transforms to $\partial'_\lambda W'^\mu(x')$. We have $\partial'_\lambda = (S^{-1})^\rho{}_\lambda\partial_\rho$ by the chain rule. So 
act with $\partial'_\lambda$ on (1) using the product rule. We watch, in horror, $\partial_\rho$ hitting $S^\mu{}_\nu$, thus wrecking 
the nice tensor transformation law. Instead, we obtain 

$$\partial'_\lambda W'^\mu(x') = \frac{\partial W'^\mu(x')}{\partial x'^\lambda} = \frac{\partial x^\rho}{\partial x'^\lambda}\frac{\partial}{\partial x^\rho}\left(S^\mu{}_\nu(x)W^\nu(x)\right) = (S^{-1})^\rho{}_\lambda S^\mu{}_\nu\,\partial_\rho W^\nu + \left[(S^{-1})^\rho{}_\lambda\,\partial_\rho S^\mu{}_\nu\right]W^\nu \qquad (2)$$

The fact that the transformation $S$ varies from place to place has negated the naive guess. If 
the second term, which comes from differentiating $S$, were absent, $\partial_\lambda W^\mu$ would transform 
like a tensor!* The naive guess would be valid. 


Wanted: A derivative that transforms properly 


What is happening is quite clear: as the vector $W$ varies from a given point to a neighboring 
point, the coordinate axes that define the components of $W$ also change. This suggests that 
we can define a more suitable derivative, known as the covariant derivative and written as 
$D_\lambda W^\mu$, to take this effect into account, so that $D_\lambda W^\mu$ would transform like a tensor with 
one upper and one lower index. 

It is also instructive here to compare (2) with something from way back, (I.4.3). There 
the rotation matrix $R$, the analog of $S$ here, sails right past the derivative, so that, under 
rotations, the derivative of a vector field is a tensor. The other difference is that $R^{-1} = R^T$, 
which is not true for $S$. 

We already encountered a similar problem in chapter I.5. The simple-minded divergence 
$\partial_\mu W^\mu(x)$ does not transform properly: $\partial_\mu W^\mu(x) \neq \partial'_\mu W'^\mu(x')$. It has to be corrected by an 
additive term to 

$$D_\mu W^\mu = \partial_\mu W^\mu + \left(\frac{1}{\sqrt{-g}}\,\partial_\mu\sqrt{-g}\right)W^\mu \qquad (3)$$


This offers a strong hint about what to do. Construct $D_\lambda W^\mu$ by adding something to 
$\partial_\lambda W^\mu$ to cancel the second term in (2). We want the covariant derivative $D_\lambda W^\mu$ to have 
many of the properties enjoyed by the ordinary derivative $\partial_\lambda W^\mu$, for example, linearity in 
$W$ (so that multiplying $W$ by 2, say, doubles $D_\lambda W^\mu$), which requires that the added term 
must be linear in $W$ just as in (3). The most general expression with the correct index 
structure is then $D_\lambda W^\mu = \partial_\lambda W^\mu + \Gamma^\mu_{\lambda\nu}W^\nu$. We need an object $\Gamma^\mu_{\lambda\nu}$ with one upper index and 
two lower indices, and we specifically want it not to be a tensor. 

But we are already acquainted with such an object, the Christoffel symbol $\Gamma^\mu_{\lambda\nu}$ from way 
back in chapters I.7, II.1, II.2, and V.3. Our notation $\Gamma$ is intentionally suggestive! To see 
that $\Gamma$ might work, we recall its definition: 

$$\Gamma^\mu_{\lambda\nu} = \frac{1}{2}\,g^{\mu\sigma}\left(\partial_\lambda g_{\nu\sigma} + \partial_\nu g_{\sigma\lambda} - \partial_\sigma g_{\lambda\nu}\right) \qquad (4)$$

It involves ordinary partial derivatives of the metric tensor $g_{\mu\nu}$, and thus, for the same 
reason as in (2), $\Gamma$ can’t possibly transform like a tensor. We can thus hope that the 
nontensorial piece in the transformation law of $\Gamma$ will cancel precisely the unwanted second 
term in (2). This “smells right”: the derivatives of $g_{\mu\nu}$ in (4) describe the very effect we are 
worried about, namely the variation from point to point of the coordinate axes relative to 
which the components of our vector $W^\mu$ are defined. 


* One of my professors, the distinguished theoretical physicist Murph Goldberger, was fond of shouting, “If 
my aunt had balls, she would be my uncle!” Believe me, this made a deep impression on a Chinese kid fresh 
from Brazil. 

Overheard at a party: “What, you’re also not a tensor? Me neither!” “Maybe we could hook 
up and become a tensor?” There is a Chinese saying that those who are similarly afflicted 
empathize with one another. And so $\partial_\lambda W^\mu$ and $\Gamma^\mu_{\lambda\nu}W^\nu$ could join hands to form one of 
those ideal couples in which one person’s character defects cancel the other person’s. 


Canceling the nontensorial pieces 


So let’s define the covariant derivative 

$$D_\lambda W^\mu = \partial_\lambda W^\mu + \Gamma^\mu_{\lambda\nu}W^\nu \qquad (5)$$


and show that it indeed transforms like a tensor. (At this point, we actually do not know that 
the two terms in (5) should be added with relative coefficient 1. But instead of cluttering 
things up with an arbitrary constant in front of the second term, we will show that the 
expression as written works.) 

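In fact, we already know how the Christoffel symbol transforms! Go back to what 
Professor Flat taught us in chapter II.2. For convenience, I copy (II.2.31) here (after 
relabeling some indices): 

$$\Gamma'^\mu_{\lambda\nu} = S^\mu{}_\rho\,(S^{-1})^\sigma{}_\lambda\,(S^{-1})^\tau{}_\nu\,\Gamma^\rho_{\sigma\tau} + S^\mu{}_\rho\,(S^{-1})^\sigma{}_\lambda\,\partial_\sigma(S^{-1})^\rho{}_\nu \qquad (6)$$

We now plug this into (5) and show that $D_\lambda W^\mu$ transforms nicely like a tensor. 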

Deep down in our hearts we already know* that it must work. Still, it is fun to see how 
the different pieces come together and knock each other out. 

From (6), we see that we need to determine $\partial S^{-1}$. For any (invertible) matrix $M$, 
differentiate $MM^{-1} = I$ to obtain $(\partial M)M^{-1} + M\,\partial M^{-1} = 0$, which allows us to relate the 
derivative of $M^{-1}$ to the derivative of $M$: 

$$\partial M^{-1} = -M^{-1}(\partial M)M^{-1} \qquad (7)$$

Notice that this generalizes what you learned in a calculus course on how to differentiate 
the inverse (that is, the reciprocal) of a function: $d\left(\frac{1}{f}\right) = -\frac{df}{f^2}$. 


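Using the identity (7), we write the second term in (6) as 

$$S^\mu{}_\rho\,(S^{-1})^\sigma{}_\lambda\,\partial_\sigma(S^{-1})^\rho{}_\nu = -(S^{-1})^\sigma{}_\lambda\,(\partial_\sigma S^\mu{}_\rho)\,(S^{-1})^\rho{}_\nu$$

Plugging this into (6) and multiplying $\Gamma'$ by $W'^\nu = S^\nu{}_\kappa W^\kappa$, we finally obtain (after quickly 
renaming indices) 

$$\Gamma'^\mu_{\lambda\nu}W'^\nu = (S^{-1})^\rho{}_\lambda\,S^\mu{}_\nu\,\Gamma^\nu_{\rho\sigma}W^\sigma - \left[(S^{-1})^\rho{}_\lambda\,\partial_\rho S^\mu{}_\nu\right]W^\nu \qquad (8)$$

Again, for convenience, I copy the nasty (2) we started this chapter with here: 

$$\partial'_\lambda W'^\mu(x') = (S^{-1})^\rho{}_\lambda\,S^\mu{}_\nu\,\partial_\rho W^\nu + \left[(S^{-1})^\rho{}_\lambda\,\partial_\rho S^\mu{}_\nu\right]W^\nu \qquad (9)$$

Adding (8) and (9), we see that the character defects in the two of them cancel each other, so 
that indeed $D'_\lambda W'^\mu(x') = (S^{-1})^\rho{}_\lambda\,S^\mu{}_\nu\,D_\rho W^\nu$ transforms like a tensor with one upper and 
one lower index. 


* From a review of my book QFT Nut for the American Mathematical Society: “It is often deeper to know why 
something is true rather than to have a proof that it is true.” 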


The covariant derivative from geometry 


Another motif from an earlier part of this book resonates with the present discussion. 
Recall that, in chapter I.7 on classical differential geometry, the mite professors came 
to the concept of covariant derivative by simply dropping from the vector $\partial_\lambda\vec{W}(x)$ the 
component sticking out of the surface. Also recall that in exercise I.7.6, you showed 
$\vec{e}_\rho\cdot\partial_\mu\vec{e}_\nu = \Gamma_{\rho\mu\nu} \equiv g_{\rho\sigma}\Gamma^\sigma_{\mu\nu}$. From the definition of the basis vectors $\vec{e}_\mu = \partial_\mu\vec{X} = \frac{\partial\vec{X}}{\partial x^\mu}$, we see 
that under the coordinate transformation $x \to x'$, we have $\vec{e}\,'_\mu = \frac{\partial\vec{X}}{\partial x'^\mu} = (S^{-1})^\nu{}_\mu\,\vec{e}_\nu$ and so 
$\Gamma'_{\rho\mu\nu} = \vec{e}\,'_\rho\cdot\partial'_\mu\vec{e}\,'_\nu = (S^{-1})^\sigma{}_\rho\,\vec{e}_\sigma\cdot(S^{-1})^\lambda{}_\mu\,\partial_\lambda\left((S^{-1})^\kappa{}_\nu\,\vec{e}_\kappa\right)$. We can see the transformation law 
of Christoffel symbols emerging. 

Although the discussion of classical differential geometry in chapter I.7 is nominally 
only for surfaces, it clearly generalizes to curved spaces and spacetimes. Some readers 
might prefer this derivation, which is more geometrical and intuitive. 

In contrast, the derivation given in the preceding section, based on requiring that $D_\lambda W^\mu$ 
transform properly, is more abstract and high powered. As I mentioned, this requirement 
of proper transformation has pervaded theoretical high energy physics in recent decades. In 
this sense, this derivation might be considered more modern and general. 


A wildly varying vector field? 


At this point, our friend the rich man could start spouting fancy talk about the covariant 
derivative, presumably without writing down a single index and disdaining such “quaint 
old-fashioned notions” as transformation, and thus cause our other friend the Jargon Guy 
to become flush with joy. Instead, let’s be more modest and, together with our friend the 
poor man, try to understand what the covariant derivative really means by working out a 
simple example. Again, a tale best told through a fable. 


A civilization of mites used the coordinates $r, \theta$ with the metric $g_{rr} = 1$, $g_{r\theta} = 0$, $g_{\theta\theta} = r^2$. 
One day they discovered a wild and woolly vector field $W^\mu(x)$, which they eventually 
determined to be given by $W^r(r, \theta) = \cos\theta$, $W^\theta(r, \theta) = -\frac{1}{r}\sin\theta$ (measured with error 
bars of course but well described phenomenologically by these expressions). For example, 
a scientific expedition sent to the point $(r, \theta) = (3, 30°)$ measured $W^r(3, 30°) = \frac{\sqrt{3}}{2}$, 
$W^\theta(3, 30°) = -\frac{1}{6}$. Another expedition sent elsewhere reported vastly different values for 
$W^\mu$, and so on. Eventually, a table of the 4 quantities $\partial_\lambda W^\mu$ was published to guide travelers. 

One day, a bright young guy pointed out that the mite savants should have calculated 
the covariant derivatives $D_\lambda W^\mu = \partial_\lambda W^\mu + \Gamma^\mu_{\lambda\nu}W^\nu$, instead of the ordinary derivatives 
$\partial_\lambda W^\mu$. The symbol $\Gamma^\mu_{\lambda\nu}$ had already been determined (way back in chapter II.2 by studying 
geodesics) to be $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \frac{1}{r}$. For example, $D_r W^\theta = \partial_r W^\theta + \Gamma^\theta_{r\theta}W^\theta = \frac{1}{r^2}\sin\theta + 
\frac{1}{r}\left(-\frac{1}{r}\sin\theta\right) = 0$. You should now go on and verify that all 4 quantities $D_\lambda W^\mu$ vanish. 

Quickly, another young guy showed the elderly savants that if they were to transform 
coordinates to $x = r\cos\theta$, $y = r\sin\theta$, then in these newfangled coordinates, $W^x = 
\frac{\partial x}{\partial r}W^r + \frac{\partial x}{\partial\theta}W^\theta = \cos\theta\,W^r - r\sin\theta\,W^\theta = 1$ and (as you should show) $W^y = 0$. After the 
fact, the savants plotted $W^r(r, \theta) = \cos\theta$, $W^\theta(r, \theta) = -\frac{1}{r}\sin\theta$ in polar coordinates and 
saw that, indeed, the vector field $W^\mu$ was constant. That $\partial_\lambda W^\mu$ did not vanish was merely 
due to the coordinate basis vectors $\vec{e}_r$ and $\vec{e}_\theta$ varying. 

The young guys explained: “Saying that a vector field is constant must mean that the 
covariant derivatives, instead of the ordinary derivatives, vanish.” Since $D_\lambda W^\mu$ is a tensor, 
if it vanishes in one coordinate system, it vanishes in all coordinate systems. The same 
statement cannot be made of a nontensor like $\partial_\lambda W^\mu$. 
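
The mites' bookkeeping is easy to reproduce. The minimal sketch below tabulates all four ordinary derivatives and all four covariant derivatives of the fable's vector field in polar coordinates, using the two nonvanishing Christoffel symbols quoted above. The function names and the use of central differences are choices made here for illustration, not anything prescribed by the text.

```python
import math

def W(r, theta):
    # Components (W^r, W^theta) of the mites' vector field.
    return [math.cos(theta), -math.sin(theta) / r]

def christoffel(r, theta):
    # G[mu][lam][nu] = Gamma^mu_{lam nu} for g_rr = 1, g_theta theta = r^2.
    # Index 0 stands for r, index 1 for theta.
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G

def ordinary_derivative(r, theta, h=1e-6):
    # dW[lam][mu] = d_lam W^mu by central differences.
    d_r = [(a - b) / (2 * h) for a, b in zip(W(r + h, theta), W(r - h, theta))]
    d_t = [(a - b) / (2 * h) for a, b in zip(W(r, theta + h), W(r, theta - h))]
    return [d_r, d_t]

def covariant_derivative(r, theta):
    # D[lam][mu] = d_lam W^mu + Gamma^mu_{lam nu} W^nu
    dW, G, w = ordinary_derivative(r, theta), christoffel(r, theta), W(r, theta)
    return [[dW[lam][mu] + sum(G[mu][lam][nu] * w[nu] for nu in range(2))
             for mu in range(2)] for lam in range(2)]
```

At the expedition's point, `covariant_derivative(3.0, math.pi / 6)` should vanish entry by entry up to finite-difference error, while `ordinary_derivative(3.0, math.pi / 6)` does not.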


Covariant derivative of tensors 


We have determined the covariant derivative of a vector with an upper index. What about 
the covariant derivative of a vector with a lower index? 

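Here I will switch to the more compact notation $W^\mu{}_{,\lambda} \equiv \partial_\lambda W^\mu$ and $W^\mu{}_{;\lambda} \equiv D_\lambda W^\mu$ (the 
first of which you already encountered in chapter I.7), also commonly used. 

As we have seen, the covariant derivative of a scalar is simply the ordinary derivative: 
$\phi_{;\lambda} = \phi_{,\lambda}$. This 
suffices to fix the covariant derivative of a vector with a lower index. Insisting that the 
covariant derivative, just like the ordinary derivative, satisfies the product rule, we have 
$(U_\mu W^\mu)_{;\lambda} = U_{\mu;\lambda}W^\mu + U_\mu W^\mu{}_{;\lambda}$. On the other hand, since $U_\mu W^\mu$ is a scalar, we have 

$$(U_\mu W^\mu)_{;\lambda} = (U_\mu W^\mu)_{,\lambda} \qquad (10)$$

For convenience, let us rewrite (5) in this semicolon notation as 

$$W^\mu{}_{;\lambda} = W^\mu{}_{,\lambda} + \Gamma^\mu_{\lambda\nu}W^\nu \qquad (11)$$

We see that the condition (10) is satisfied if 

$$U_{\mu;\lambda} = U_{\mu,\lambda} - \Gamma^\sigma_{\lambda\mu}U_\sigma \qquad (12)$$

Contrast the minus sign in (12) with the plus sign in (11). We can readily check that the 
opposite signs ensure that the Christoffel symbols cancel out, thus giving us (10): 

$$(U_\mu W^\mu)_{;\lambda} = (U_{\mu,\lambda} - \Gamma^\sigma_{\lambda\mu}U_\sigma)W^\mu + U_\mu(W^\mu{}_{,\lambda} + \Gamma^\mu_{\lambda\nu}W^\nu) = U_{\mu,\lambda}W^\mu + U_\mu W^\mu{}_{,\lambda} = (U_\mu W^\mu)_{,\lambda}$$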
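
To see the minus sign of (12) at work, here is a small sketch under assumptions chosen for illustration: the polar metric of the mite fable, and the lower-index vector $U_\mu = \partial_\mu\phi$ built from the hypothetical scalar $\phi = x^2$. Contracting $U_{\mu;\lambda}$ with the inverse metric should give the flat-space Laplacian of $x^2$, namely 2, in any coordinates.

```python
import math

def phi(r, theta):
    # The scalar x^2 written in polar coordinates (illustrative choice).
    return (r * math.cos(theta)) ** 2

def U(r, theta, h=1e-4):
    # U_mu = d_mu phi, a vector with a lower index (index 0 = r, 1 = theta).
    return [(phi(r + h, theta) - phi(r - h, theta)) / (2 * h),
            (phi(r, theta + h) - phi(r, theta - h)) / (2 * h)]

def covariant_derivative_lower(r, theta, h=1e-4):
    # Rule (12): D[lam][mu] = d_lam U_mu - Gamma^sigma_{lam mu} U_sigma
    G = [[[0.0, 0.0], [0.0, -r]],
         [[0.0, 1.0 / r], [1.0 / r, 0.0]]]  # G[sigma][lam][mu], polar metric
    u = U(r, theta)
    dU = [[(a - b) / (2 * h) for a, b in zip(U(r + h, theta), U(r - h, theta))],
          [(a - b) / (2 * h) for a, b in zip(U(r, theta + h), U(r, theta - h))]]
    return [[dU[lam][mu] - sum(G[s][lam][mu] * u[s] for s in range(2))
             for mu in range(2)] for lam in range(2)]

def laplacian(r, theta):
    # g^{lam mu} U_{mu;lam} with g^{rr} = 1 and g^{theta theta} = 1 / r^2
    D = covariant_derivative_lower(r, theta)
    return D[0][0] + D[1][1] / r ** 2
```

Without the $-\Gamma$ term the result would depend on where it is evaluated; with it, `laplacian(r, theta)` should come out close to 2 at any sample point.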


The covariant derivative of tensors with more indices, such as $T^{\mu\nu}{}_\rho$, can be worked out 
by pretending that $T^{\mu\nu}{}_\rho = W^\mu Y^\nu U_\rho$ is made up of three vectors, so that we can use (11) 
and (12) repeatedly. The pretense works because all we care about in this context is the 
transformation property of $T$ rather than its true nature. Thus, 

$$T^{\mu\nu}{}_{\rho;\lambda} = T^{\mu\nu}{}_{\rho,\lambda} + \Gamma^\mu_{\lambda\sigma}T^{\sigma\nu}{}_\rho + \Gamma^\nu_{\lambda\sigma}T^{\mu\sigma}{}_\rho - \Gamma^\sigma_{\lambda\rho}T^{\mu\nu}{}_\sigma \qquad (13)$$

Once you know how to differentiate vectors, you know how to differentiate tensors. 
The general rule should be obvious: colloquially, for each upper index, attach a $+\Gamma$, 
and for each lower index, attach a $-\Gamma$. We now know how to repeatedly take the covariant 
derivative. For example, since $U_{\mu;\lambda}$ is a tensor, we can apply the rule and write 

$$U_{\mu;\lambda;\rho} = U_{\mu;\lambda,\rho} - \Gamma^\sigma_{\rho\mu}U_{\sigma;\lambda} - \Gamma^\sigma_{\rho\lambda}U_{\mu;\sigma} \qquad (14)$$

Starting from (11), we defined the covariant derivative in such a way that the product rule 
is clearly satisfied. For example, $(S^\rho T^\mu)_{;\lambda} = S^\rho{}_{;\lambda}T^\mu + S^\rho T^\mu{}_{;\lambda}$. 

Mirror mirror on the wall, what is the most special tensor of them all? I trust that the 
magic mirror would say the metric tensor $g_{\mu\nu}$. Would you be surprised to learn that the 
covariant derivative of the metric tensor vanishes? You should check that indeed 

$$g_{\mu\nu;\lambda} = 0 \qquad (15)$$

Of course, the ordinary derivative $g_{\mu\nu,\lambda}$ is assuredly not zero. In other words, the metric 
tensor is not (necessarily) constant, but it is always covariantly constant. That sure makes 
sense.* 
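
One way to carry out the check, sketched here as a worked derivation, is to apply the rule for two lower indices and then lower the upper index of the Christoffel symbol using the definition (4). The intermediate lines are a reconstruction offered as an aid, not the author's own presentation.

```latex
\begin{aligned}
g_{\mu\nu;\lambda}
 &= g_{\mu\nu,\lambda} - \Gamma^{\sigma}_{\lambda\mu}\, g_{\sigma\nu}
    - \Gamma^{\sigma}_{\lambda\nu}\, g_{\mu\sigma} \\
g_{\sigma\nu}\,\Gamma^{\sigma}_{\lambda\mu}
 &= \tfrac12\left(\partial_\lambda g_{\mu\nu} + \partial_\mu g_{\nu\lambda}
    - \partial_\nu g_{\lambda\mu}\right) \\
g_{\mu\sigma}\,\Gamma^{\sigma}_{\lambda\nu}
 &= \tfrac12\left(\partial_\lambda g_{\nu\mu} + \partial_\nu g_{\mu\lambda}
    - \partial_\mu g_{\lambda\nu}\right) \\
g_{\sigma\nu}\,\Gamma^{\sigma}_{\lambda\mu} + g_{\mu\sigma}\,\Gamma^{\sigma}_{\lambda\nu}
 &= \partial_\lambda g_{\mu\nu}
 \quad\Longrightarrow\quad g_{\mu\nu;\lambda} = 0
\end{aligned}
```

The four terms with the derivative index on $\mu$ or $\nu$ cancel in pairs by the symmetry of the metric, leaving exactly the ordinary derivative that the first line subtracts from.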

In the preceding chapter, I defined the commutator $C = [A, B]$ of two vector fields $A^\mu$ 
and $B^\mu$. Since the Christoffel symbol is symmetric in its two lower indices, the ordinary 
derivative in the definition can in fact be replaced by covariant derivatives: 

$$C^\nu = [A, B]^\nu = A^\mu(\partial_\mu B^\nu) - B^\mu(\partial_\mu A^\nu) = A^\mu(D_\mu B^\nu) - B^\mu(D_\mu A^\nu) \qquad (16)$$


To get a feel for the covariant derivative, you should practice writing down a few more 
examples. 


Electromagnetism in curved spacetime 


Notice that the covariant curl is equal to the ordinary curl 

$$U_{\mu;\lambda} - U_{\lambda;\mu} = U_{\mu,\lambda} - U_{\lambda,\mu} \qquad (17)$$

The Christoffel terms cancel. In particular, in curved spacetime, the electromagnetic field 
strength is still† given by $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, a fact we can now exploit. 

In chapter V.2, I extolled the power of the equivalence principle. As another example, 
the equivalence principle tells us that we can immediately obtain the action of an 
electromagnetic field in the presence of a gravitational field by promoting the Maxwell 
action $-\frac{1}{4}\int d^4x\,F^{\mu\nu}F_{\mu\nu} = -\frac{1}{4}\int d^4x\,\eta^{\mu\rho}\eta^{\nu\sigma}F_{\mu\nu}F_{\rho\sigma}$ in (IV.2.6) to 

$$S_{\text{Maxwell}} = -\frac{1}{4}\int d^4x\,\sqrt{-g}\;g^{\mu\rho}g^{\nu\sigma}F_{\mu\nu}F_{\rho\sigma} \qquad (18)$$


* Some people prefer the following slightly more mathematical approach to the covariant derivative. After 
defining $D_\lambda W^\mu = \partial_\lambda W^\mu + \Gamma^\mu_{\lambda\nu}W^\nu$ with an unknown object $\Gamma$, as was done earlier in this chapter, extend the 
definition to the covariant derivative of a tensor. Then impose the condition $D_\lambda g_{\mu\nu} = 0$ to determine $\Gamma$. 

† Indeed, in chapter V.4, you might have wondered how $F_{\mu\nu}$ is defined in curved spacetime. 




In light of the preceding remark, in (18) the effect of the gravitational field on electromag- 
netism has been explicitly displayed. 


A matrix identity, the patented “1-2” test, and the covariant divergence 


We have one loose end to tie up. As noted in (3), we have already encountered the covariant 
divergence $D_\mu W^\mu = \frac{1}{\sqrt{-g}}\,\partial_\mu(\sqrt{-g}\,W^\mu)$ way back in chapter I.5. However, we can also 
obtain the covariant divergence by contracting the indices in (5): $D_\mu W^\mu = \partial_\mu W^\mu + \Gamma^\mu_{\mu\nu}W^\nu$. 
From (4), we have $\Gamma^\mu_{\mu\nu} = \frac{1}{2}\,g^{\mu\sigma}\partial_\nu g_{\mu\sigma}$. 

For these two forms to agree with each other, so that the laws of arithmetic are upheld, 
we must have $g^{\mu\sigma}\partial_\nu g_{\mu\sigma} = \frac{1}{g}\,\partial_\nu g$. There must be a cool matrix identity involving the 
determinant, and indeed there is! 


For any diagonalizable matrix $M$, 

$$\log\det M = \operatorname{tr}\log M \qquad (19)$$

(The logarithm of a matrix can be formally defined by a power series $\log M = \log(I - 
(I - M)) = -\sum_{k=1}^{\infty}(I - M)^k/k$.) To prove (19), simply diagonalize $M = A^{-1}DA$ with $D$ 
a diagonal matrix with entries $d_1, d_2, \cdots, d_{\text{dimension of } M}$. Then $\log\det M = \log\det D = 
\log\prod_i d_i = \sum_i\log d_i = \operatorname{tr}\log D = \operatorname{tr}A^{-1}(\log D)A = \operatorname{tr}\log M$, and we have proved (19). 

Differentiate (19) to get* $(\det M)^{-1}\,\partial\det M = \partial(\operatorname{tr}\log M) = \operatorname{tr}(\partial\log M) = \operatorname{tr}(M^{-1}\partial M)$. 
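
In the spirit of the “1-2” test, the differentiated identity can be checked numerically on a 2-by-2 matrix. The one-parameter family `M_of(t)` below is a hypothetical example invented for this sketch; any smooth family of invertible matrices would do.

```python
import math

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def M_of(t):
    # Hypothetical smooth family of invertible 2-by-2 matrices.
    return [[2.0 + t, t * t], [math.sin(t), 3.0 - t]]

def both_sides(t, h=1e-6):
    Mp, Mm, M = M_of(t + h), M_of(t - h), M_of(t)
    # dM/dt by central differences, entry by entry.
    dM = [[(Mp[i][j] - Mm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    # Left side: (det M)^{-1} d(det M)/dt
    lhs = (det2(Mp) - det2(Mm)) / (2 * h) / det2(M)
    # Right side: tr(M^{-1} dM/dt)
    Minv = inv2(M)
    rhs = sum(Minv[i][j] * dM[j][i] for i in range(2) for j in range(2))
    return lhs, rhs
```

For example, `both_sides(0.5)` returns two numbers that should agree up to finite-difference error. The 1-by-1 case is just $d\log m = dm/m$.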


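In particular, substituting the metric $g_{\mu\nu}$ for $M$, we obtain the desired equality 

$$\Gamma^\mu_{\mu\nu} = \frac{1}{2g}\,\partial_\nu g = \frac{1}{2}\,\partial_\nu\log g = \frac{1}{2}\,g^{\mu\sigma}\partial_\nu g_{\mu\sigma} = \frac{1}{\sqrt{-g}}\,\partial_\nu\sqrt{-g} \qquad (20)$$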


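Using (20), we readily verify that the usual identities involving the partial derivative also 
work for the covariant derivative, provided that the correct integration measure $d^4x\sqrt{-g}$ 
is used instead of $d^4x$. For example, from (20), it follows that 

$$\int d^4x\,\sqrt{-g}\,D_\mu W^\mu = \int d^4x\,\sqrt{-g}\left(\partial_\mu W^\mu + \Gamma^\mu_{\mu\nu}W^\nu\right) = \int d^4x\,\sqrt{-g}\left(\left(\frac{1}{\sqrt{-g}}\,\partial_\nu\sqrt{-g}\right)W^\nu + \partial_\mu W^\mu\right) = 0$$

since the integrand is the total derivative $\partial_\mu(\sqrt{-g}\,W^\mu)$, which integrates to zero when $W^\mu$ vanishes at the boundary. 
(Of course, this also follows directly from the covariant divergence 

$$D_\mu W^\mu = \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,W^\mu\right)$$

which we started this discussion with.) 
Integration by parts also works in the same way. For example, 

$$\int d^4x\,\sqrt{-g}\,K^{\mu\nu}D_\mu W_\nu = -\int d^4x\,\sqrt{-g}\,(D_\mu K^{\mu\nu})\,W_\nu \qquad (21)$$

This follows from what we just learned ($\int d^4x\,\sqrt{-g}\,D_\mu(K^{\mu\nu}W_\nu) = 0$) and the product rule. 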
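
The agreement between the two forms of the covariant divergence can be seen in a concrete case. The sketch below uses the mites' polar metric, which is Euclidean, so $\sqrt{g} = r$ plays the role of $\sqrt{-g}$; that substitution, and the vector field `W`, are assumptions made purely for illustration.

```python
import math

def W(r, theta):
    # Hypothetical vector field components (W^r, W^theta).
    return [r * r * math.sin(theta), math.cos(theta) / r]

def sqrt_g(r, theta):
    # g_rr = 1, g_theta theta = r^2, so det g = r^2.
    return r

def div_christoffel(r, t, h=1e-6):
    # d_mu W^mu + Gamma^mu_{mu nu} W^nu, with Gamma^mu_{mu r} = 1/r and
    # Gamma^mu_{mu theta} = 0 for the polar metric.
    dWr = (W(r + h, t)[0] - W(r - h, t)[0]) / (2 * h)
    dWt = (W(r, t + h)[1] - W(r, t - h)[1]) / (2 * h)
    return dWr + dWt + W(r, t)[0] / r

def div_density(r, t, h=1e-6):
    # (1 / sqrt g) d_mu (sqrt g W^mu)
    f_r = lambda rr: sqrt_g(rr, t) * W(rr, t)[0]
    f_t = lambda tt: sqrt_g(r, tt) * W(r, tt)[1]
    total = (f_r(r + h) - f_r(r - h)) + (f_t(t + h) - f_t(t - h))
    return total / (2 * h) / sqrt_g(r, t)
```

Both functions should return the same number at any point with $r > 0$; for this particular field the common value works out analytically to $3r\sin\theta - \sin\theta/r$.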


* By the way, to verify matrix identities of this type, you can always apply my patented “1-2” test: check it for 
1-by-1 and 2-by-2 matrices. 


Appendix 1: Differentiation along a curve 


Given a curve $C$ parametrized by $\zeta$ and described by $X^\mu(\zeta)$, we might be interested in how various quantities 
defined on the curve change as we move along the curve. An everyday example might be the temperature along a 
highway we are driving on. More relevant to physics, an example might be the direction in which a macroscopic 
object (like a gyroscope) or a microscopic object (like an elementary particle) is spinning, with $C$ its worldline. In 
fact, let us focus on a vector $W^\mu(\zeta)$, with the word “vector” implying that under a coordinate transformation, 

$$W'^\mu(\zeta) = S^\mu{}_\nu(X(\zeta))\,W^\nu(\zeta) \qquad (22)$$

It is important to realize that we are not talking about a vector field $W^\mu(x)$, as in the text. Our vector $W^\mu(\zeta)$ 
is meaningful only on $C$. For example, the spin of an electron is defined only on its worldline. 

Note also that I did not say that the curve $C$ is necessarily a geodesic. Our gyroscope could be inside some 
rocketship in full throttle blasting by some black hole, for example, and not in free fall. 

The question is how to differentiate $W^\mu(\zeta)$ along $C$. By now you have caught on that the naive proposal $\frac{dW^\mu}{d\zeta}$ 
is not going to cut it: it does not transform like a vector. From (22) we have 

$$\frac{dW'^\mu(\zeta)}{d\zeta} = S^\mu{}_\nu(X(\zeta))\,\frac{dW^\nu(\zeta)}{d\zeta} + \left(\partial_\lambda S^\mu{}_\nu(X(\zeta))\right)V^\lambda(\zeta)\,W^\nu(\zeta) \qquad (23)$$

where $V^\lambda(\zeta) \equiv \frac{dX^\lambda(\zeta)}{d\zeta}$ is the tangent or velocity vector to the curve $C$ at the point $X(\zeta)$. If only the first term on 
the right hand side of (23) were present, then $\frac{dW^\mu}{d\zeta}$ would transform like a vector. 

But by now you also know how to fix this. Define the covariant derivative $\frac{DW^\mu}{D\zeta}$ along the curve by 

$$\frac{DW^\mu(\zeta)}{D\zeta} = \frac{dW^\mu(\zeta)}{d\zeta} + \Gamma^\mu_{\lambda\nu}(X(\zeta))\,V^\lambda(\zeta)\,W^\nu(\zeta) \qquad (24)$$

Using (6), you can check that $\frac{DW^\mu}{D\zeta}$ indeed transforms like a vector: $\frac{DW'^\mu}{D\zeta} = S^\mu{}_\nu(X(\zeta))\,\frac{DW^\nu}{D\zeta}$. 

In part IX, we will come back to this covariant derivative along a curve. 
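
A small numerical illustration of (24), under assumptions chosen for this sketch only: the curve is a circle of radius `R` in the mites' polar coordinates, parametrized by $\zeta = \theta$, and the vector carried along it is the constant Cartesian unit vector along $x$, expressed in polar components. Since that vector does not actually change, its covariant derivative along the curve should vanish even though its components vary.

```python
import math

R = 2.0  # radius of the circle (arbitrary illustrative value)

def X(zeta):
    # The curve r = R, theta = zeta.
    return (R, zeta)

def V(zeta):
    # Tangent vector V^lam = dX^lam / dzeta = (0, 1).
    return (0.0, 1.0)

def W(zeta):
    # Polar components of the constant Cartesian vector (1, 0) on the curve.
    return (math.cos(zeta), -math.sin(zeta) / R)

def covariant_derivative_along_curve(zeta, h=1e-6):
    r, theta = X(zeta)
    # G[mu][lam][nu] = Gamma^mu_{lam nu} for the polar metric at X(zeta).
    G = [[[0.0, 0.0], [0.0, -r]],
         [[0.0, 1.0 / r], [1.0 / r, 0.0]]]
    v, w = V(zeta), W(zeta)
    dW = [(a - b) / (2 * h) for a, b in zip(W(zeta + h), W(zeta - h))]
    # Equation (24): DW^mu/Dzeta = dW^mu/dzeta + Gamma^mu_{lam nu} V^lam W^nu
    return [dW[mu] + sum(G[mu][lam][nu] * v[lam] * w[nu]
                         for lam in range(2) for nu in range(2))
            for mu in range(2)]
```

The naive derivative `dW` is visibly nonzero along the circle, while the list returned by `covariant_derivative_along_curve` should be zero up to finite-difference error.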


Appendix 2: Lie derivative 


Given a vector field $V^\mu(x)$, we physicists can readily picture it as the local velocity field of a fluid, albeit in spacetime 
rather than in space. Speaking loosely, we can mentally “fill in” the flow by connecting the “feathered ends” of 
$V^\mu(x)$ and construct the trajectories of an infinitesimal fluid element. See figure 1. More formally, integrate the 
first order equation $\frac{dX^\mu}{d\tau} = V^\mu(X(\tau))$ for $X(\tau)$. 

Now suppose we are given a tensor field $W^{\cdots}_{\cdots}(x)$ in addition to $V^\mu(x)$. We could differentiate $W^{\cdots}_{\cdots}(x)$ by 
comparing its value at two nearby points $P$ and $Q$. More precisely, let the coordinates of $P$ and $Q$ be $x$ and $\tilde{x}$, 
respectively; then $W^{\cdots}_{\cdots}(\tilde{x}) - W^{\cdots}_{\cdots}(x) \simeq (\tilde{x} - x)^\lambda\,\partial_\lambda W^{\cdots}_{\cdots}(x)$, as Newton and Leibniz taught us. No mystery there, 
we have known this as the ordinary derivative since childhood. 

Figure 1 A vector field visualized as a fluid. 

Sophus Lie now invites us to do something different. Let $\tilde{x}$ be the location that the fluid element at $x$ flows to: 
$x^\mu \to \tilde{x}^\mu = x^\mu + d\tau\,V^\mu(x(\tau))$. For pedagogical clarity, let us specialize to the case of a vector field $W^\mu(x)$ instead of 
$W^{\cdots}_{\cdots}(x)$; you can always fill in the dots on $W^{\cdots}_{\cdots}(x)$ to your heart’s content later. Lie says, “Going with the flow, we 
‘drag’ $W^\mu(x)$ to the point $\tilde{x}$ by regarding $x^\mu \to \tilde{x}^\mu = x^\mu + d\tau\,V^\mu(x(\tau))$ as a coordinate transformation.” In other 
words, we define $\tilde{W}^\mu(\tilde{x}) = W^\nu(x)\,\frac{\partial\tilde{x}^\mu}{\partial x^\nu} = W^\mu(x) + d\tau\,W^\nu(x)\,\partial_\nu V^\mu(x(\tau))$. We are now supposed to compare this 
with the vector field $W^\mu$ evaluated at $Q$, namely $W^\mu(\tilde{x})$. Got that? 

Lie tells us to do something different from what Newton and Leibniz told us to do: instead of comparing 
$W^\mu(\tilde{x})$ with $W^\mu(x)$, compare $\tilde{W}^\mu(\tilde{x})$ and $W^\mu(\tilde{x})$. So, take the limit $W^\mu(\tilde{x}) - \tilde{W}^\mu(\tilde{x}) \simeq d\tau\,V^\nu(x(\tau))\,\partial_\nu W^\mu(x) - 
d\tau\,W^\nu(x)\,\partial_\nu V^\mu(x(\tau))$. 

To underline what we are doing, speak colloquially for a moment. “Let’s be cool and go with the flow, but 
hmm, somehow this vector we are carrying is not as relaxed as we are and is not pointing in the direction we 
expect; something must be acting on this vector.” This difference, between actual (namely $W^\mu(\tilde{x})$) and expected 
(namely $\tilde{W}^\mu(\tilde{x})$), is what we want to measure. 

Given two vector fields $V^\mu(x)$ and $W^\mu(x)$, define the Lie derivative of $W^\mu(x)$ in the direction of $V^\mu(x)$ by 

$$\mathcal{L}_V W^\mu(x) = V^\nu(x)\,\partial_\nu W^\mu(x) - W^\nu(x)\,\partial_\nu V^\mu(x) = V^\nu(x)\,D_\nu W^\mu(x) - W^\nu(x)\,D_\nu V^\mu(x) \qquad (25)$$

In the last step, we replace $\partial_\nu$ by $D_\nu$, which you can verify is allowed, since the Christoffel symbol is symmetric 
in its two lower indices. This should remind you of a similar step in the text when we discussed the commutator 
$C = [A, B]$ of two vector fields. Indeed, the connection between the two discussions should leap out at you: the Lie 
derivative $\mathcal{L}_V W^\mu$ is just the $\mu$ component of the commutator $[V, W]$, namely $\mathcal{L}_V W^\mu = [V, W]^\mu$. Those readers 
who know that Lie algebras are constructed out of commutators (as we saw in chapter I.3) would not be surprised 
that the same person was responsible for the Lie derivative and Lie algebra. Mathematically, the Lie derivative 
is regarded as being a more “primitive” concept than the covariant derivative, since, as shown in (25), it can be 
defined without referring to the Christoffel symbol. 
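
Lie's dragging construction can be imitated directly on a computer and compared with formula (25). In the sketch below, the flow `V` (a rigid rotation) and the field `W` are hypothetical examples, the step `dtau` is small but finite, and all derivatives are taken by central differences; none of these choices comes from the text.

```python
def V(p):
    x, y = p
    return [-y, x]              # hypothetical flow field: rigid rotation

def W(p):
    x, y = p
    return [x * y, x + y * y]   # hypothetical vector field being dragged

def jac(F, p, h=1e-6):
    # J[nu][mu] = d_nu F^mu by central differences.
    J = []
    for nu in range(len(p)):
        up, dn = list(p), list(p)
        up[nu] += h
        dn[nu] -= h
        Fu, Fd = F(up), F(dn)
        J.append([(a - b) / (2 * h) for a, b in zip(Fu, Fd)])
    return J

def lie_formula(p):
    # Equation (25): V^nu d_nu W^mu - W^nu d_nu V^mu
    v, w, dV, dW = V(p), W(p), jac(V, p), jac(W, p)
    return [sum(v[nu] * dW[nu][mu] - w[nu] * dV[nu][mu] for nu in range(2))
            for mu in range(2)]

def lie_by_dragging(p, dtau=1e-5):
    v, w, dV = V(p), W(p), jac(V, p)
    p_tilde = [p[mu] + dtau * v[mu] for mu in range(2)]
    # Expected (dragged) vector: W^mu(x) + dtau W^nu(x) d_nu V^mu(x)
    dragged = [w[mu] + dtau * sum(w[nu] * dV[nu][mu] for nu in range(2))
               for mu in range(2)]
    actual = W(p_tilde)  # the field itself, evaluated at the new point
    return [(actual[mu] - dragged[mu]) / dtau for mu in range(2)]
```

At the point `[1.0, 2.0]` the formula gives $(2, 0)$ for these fields, and the dragging estimate should approach it as `dtau` shrinks.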

The last step in (25) suggests defining the covariant derivative in the direction of a given vector $V$ by 
$D_V \equiv V^\mu(x)D_\mu$. (The notation $\nabla_V$ is also often used.) In fact, we now see that, in (24), if $W^\mu$ is a vector field, then 
the derivative along a curve is just $D_V W^\mu$, with $V$ the tangent vector of the curve. 

Misconception alert! We can replace the ordinary derivative by the covariant derivative in $[V, W]^\mu$ (as shown 
in (25)), but not in $[V, W]$. In other words, $[V, W] \neq [D_V, D_W]$. 

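Going through the same steps for a vector field with a lower index $U_\mu(x)$, we obtain 

$$\mathcal{L}_V U_\mu(x) = V^\nu(x)\,\partial_\nu U_\mu(x) + U_\nu(x)\,\partial_\mu V^\nu(x) = V^\nu(x)\,D_\nu U_\mu(x) + U_\nu(x)\,D_\mu V^\nu(x) \qquad (26)$$

Various properties of the Lie derivative follow. For example, it satisfies the product rule: $\mathcal{L}_V(U_\mu Y_\nu) = 
(\mathcal{L}_V U_\mu)Y_\nu + U_\mu(\mathcal{L}_V Y_\nu)$. This allows us to define the Lie derivative of tensors immediately: as in the discussion 
surrounding (13), simply pretend that the tensor is a product of vectors with upper and lower indices. For 
example, 

$$\mathcal{L}_V W_{\mu\nu} = V^\lambda\,\partial_\lambda W_{\mu\nu} + W_{\lambda\nu}\,\partial_\mu V^\lambda + W_{\mu\lambda}\,\partial_\nu V^\lambda \qquad (27)$$

A trivial example is that the Lie derivative of a scalar is just $\mathcal{L}_V\phi = V^\mu\partial_\mu\phi$. 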

A tensor $W^{\cdots}_{\cdots}$ is said to be Lie transported along a curve if its Lie derivative along the curve vanishes, namely 
$\mathcal{L}_V W^{\cdots}_{\cdots} = 0$, with $V$ the tangent vector to the curve. To understand physically what the mathematician is talking 
about, think of the curve as your geodesic as you move through spacetime. Set up coordinates so that $x^0$ is 
just your proper time and $x^1, \cdots, x^{d-1}$ are constant along your geodesic. (Indeed, set them all to 0; you are at 
the center of your universe.) Then $V^\mu = \frac{dx^\mu}{d\tau} = (1, 0, \cdots, 0)$. Look at (27), for example: with this coordinate 
choice, since $\partial_\mu V^\nu = 0$, we have $0 = \mathcal{L}_V W_{\mu\nu} = V^\lambda\,\partial_\lambda W_{\mu\nu} = \partial_0 W_{\mu\nu}$. The math types make it sound mysterious, 
but a tensor Lie transported along your geodesic is simply a tensor that does not change in time (your proper 
time, that is). 

The curve in the definition of Lie transportation does not necessarily have to be a geodesic: I just pick it as an 
example. It could be a curve in a flow field $V^\mu(x)$. People in the New Age talk about going with the flow, but what 
should they do with the vectors and tensors they want to carry with them? Lie transport them, that’s what. 


Appendix 3: Transforming the Christoffel symbol by brute force 


You can skip this appendix and the next one upon a first reading of the text. They are for those readers who thirst 
for explicit computation, not believing in anything until they have ground it out. 

To work out how $\Gamma^\mu_{\lambda\nu}$ transforms, you can just plug the transformation law $g'_{\mu\nu}(x') = g_{\lambda\sigma}(x)\,(S^{-1})^\lambda{}_\mu\,(S^{-1})^\sigma{}_\nu$ 
into (4) and proceed by brute force. It will be convenient in this calculation to keep referring to the collection of 
formulas in the back of the book. 

As I explained, if we could sail the various factors of $S^{-1}$ past the derivatives in (4), $\Gamma$ would transform like 
a tensor with one upper index and two lower indices. In fact, we can’t, and there are extra terms involving the 
derivatives acting on $S^{-1}$, so that we expect 

$$\Gamma'^\mu_{\lambda\nu} = S^\mu{}_\rho\,(S^{-1})^\sigma{}_\lambda\,(S^{-1})^\tau{}_\nu\,\Gamma^\rho_{\sigma\tau} + M^\mu_{\lambda\nu} \qquad (28)$$

where $M^\mu_{\lambda\nu}$ involves $\partial S^{-1}$. The claim is that $M^\mu_{\lambda\nu}W'^\nu$ will cancel the unwanted term in (2). 

Let us now take a deep breath and churn through this straightforward though somewhat laborious calculation, 
starting from the definition $\Gamma^\mu_{\lambda\nu} = \frac{1}{2}\,g^{\mu\rho}(\partial_\lambda g_{\nu\rho} + \partial_\nu g_{\lambda\rho} - \partial_\rho g_{\lambda\nu})$. Perhaps you should try to do it first. I should 
warn you though, you have to slog through a swamp of indices. Don’t give up too soon! It is merely an exercise 
in multivariable calculus after all, keeping track of various partial derivatives. 

Focus on a piece of $\Gamma^\mu_{\lambda\nu}$ (say, $g^{\mu\rho}\partial_\lambda g_{\nu\rho}$), and ask how it transforms. Let’s first suppress indices to get oriented: 
$g^{\cdot\cdot}\partial g_{\cdot\cdot}$ transforms into $\sim SSg^{\cdot\cdot}S^{-1}\partial(g_{\cdot\cdot}S^{-1}S^{-1})$. Looking only at the terms generated when $\partial$ hits $S^{-1}$, we have 
$\sim SSS^{-1}S^{-1}\partial S^{-1} \sim S^{-1}(\partial S)S^{-1}$. 

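I am now going to keep careful track of the indices, which of course makes the calculation seem clunky (and 
confusing when it in fact isn’t). First, 

$$\partial_\lambda g_{\nu\rho} \to \partial'_\lambda g'_{\nu\rho} = \partial'_\lambda\left[g_{\sigma\tau}\,(S^{-1})^\sigma{}_\nu\,(S^{-1})^\tau{}_\rho\right] \qquad (29)$$

There are two kinds of terms: those in which the derivative hits $g_{\sigma\tau}(x)$ and those in which it hits the $S^{-1}$s. 
Clearly, the first kind of terms $(S^{-1})^\kappa{}_\lambda\,(\partial_\kappa g_{\sigma\tau})\,(S^{-1})^\sigma{}_\nu\,(S^{-1})^\tau{}_\rho$, those that are present even if $S$ does not vary from 
place to place, will take care of themselves. As the discussion after (2) made clear, we should focus on the 
troublemakers, namely the second kind of terms contained in $\partial'_\lambda g'_{\nu\rho}$ due to the variation of $S$, for which we invent 
on the spot a double bracket notation $\{\{\partial'_\lambda g'_{\nu\rho}\}\}$. To save writing, define $K^\sigma_{\lambda\nu} \equiv \partial'_\lambda(S^{-1})^\sigma{}_\nu = \frac{\partial^2 x^\sigma}{\partial x'^\lambda\,\partial x'^\nu}$. Then, 
corresponding to $\partial'_\lambda$ in (29) hitting one or the other of the two $S^{-1}$s in the square bracket, we obtain two terms 
$\{\{\partial'_\lambda g'_{\nu\rho}\}\} = g_{\sigma\tau}\left[K^\sigma_{\lambda\nu}\,(S^{-1})^\tau{}_\rho + K^\tau_{\lambda\rho}\,(S^{-1})^\sigma{}_\nu\right]$. 

You may not recall instantly, but this has precisely the form of the little lemma you proved way way back in 
exercise I.4.8, namely $H_{\lambda\cdot\nu\rho} = G_{\lambda\nu\cdot\rho} + G_{\lambda\rho\cdot\nu}$, where $H_{\lambda\cdot\nu\rho} \equiv \{\{\partial'_\lambda g'_{\nu\rho}\}\}$ is manifestly symmetric under $\nu \leftrightarrow \rho$, 
and $G_{\lambda\nu\cdot\rho} \equiv g_{\sigma\tau}\,K^\sigma_{\lambda\nu}\,(S^{-1})^\tau{}_\rho$ is manifestly symmetric under $\lambda \leftrightarrow \nu$. To determine how $\Gamma^\mu_{\lambda\nu}$ transforms, we need 
precisely the combination found in that exercise: $H_{\lambda\cdot\nu\rho} + H_{\nu\cdot\lambda\rho} - H_{\rho\cdot\lambda\nu} = 2G_{\lambda\nu\cdot\rho} = 2g_{\sigma\tau}\,K^\sigma_{\lambda\nu}\,(S^{-1})^\tau{}_\rho$. 

Putting it all together, we obtain $\{\{\Gamma'^\mu_{\lambda\nu}\}\} = \frac{1}{2}\,S^\mu{}_\kappa\,S^\rho{}_\omega\,g^{\kappa\omega}\left(2g_{\sigma\tau}\,K^\sigma_{\lambda\nu}\,(S^{-1})^\tau{}_\rho\right) = S^\mu{}_\sigma\,K^\sigma_{\lambda\nu} = S^\mu{}_\sigma\,\partial'_\lambda(S^{-1})^\sigma{}_\nu = 
S^\mu{}_\sigma\,(S^{-1})^\rho{}_\lambda\,\partial_\rho(S^{-1})^\sigma{}_\nu = -(S^{-1})^\rho{}_\lambda\,(\partial_\rho S^\mu{}_\sigma)\,(S^{-1})^\sigma{}_\nu$. In the last step, we used the matrix identity (7). This is precisely 
what we heuristically guessed for $M^\mu_{\lambda\nu} = \{\{\Gamma'^\mu_{\lambda\nu}\}\}$. Note that the factor $\frac{1}{2}$ in the definition of the Christoffel symbol 
is needed here. 

Thus, $\{\{\Gamma'^\mu_{\lambda\nu}\}\}W'^\nu = -(S^{-1})^\rho{}_\lambda\,(\partial_\rho S^\mu{}_\nu)\,W^\nu$, precisely the negative of the unwanted second term in (2). So indeed 
we have $D'_\lambda W'^\mu = (S^{-1})^\rho{}_\lambda\,S^\mu{}_\nu\,D_\rho W^\nu$, and $D_\lambda W^\mu$ transforms as a tensor as desired. It all seems a bit complicated, 
but in fact it is simple: as I explained in the text, the two terms in $D_\lambda W^\mu$ each produce unwanted terms, but they 
cancel each other. 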

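Incidentally, if we write out $S$, $S^{-1}$, and $K$ explicitly, we have found here that 

$$\Gamma'^\mu_{\lambda\nu} = \frac{\partial x'^\mu}{\partial x^\rho}\,\frac{\partial x^\sigma}{\partial x'^\lambda}\,\frac{\partial x^\tau}{\partial x'^\nu}\,\Gamma^\rho_{\sigma\tau} + \frac{\partial x'^\mu}{\partial x^\rho}\,\frac{\partial^2 x^\rho}{\partial x'^\lambda\,\partial x'^\nu} \qquad (30)$$

which agrees with (6), namely what we had back in chapter II.2, of course. 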
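
A concrete way to appreciate the second term of (30) is to start from coordinates in which the Christoffel symbol vanishes. In the sketch below, the unprimed coordinates are assumed to be Cartesian $(x, y)$ on the flat plane, so $\Gamma = 0$, and the primed ones are polar $(r, \theta)$; the inhomogeneous term alone must then reproduce the polar Christoffel symbols of the mite fable. The derivatives are entered analytically, and the function name is invented for this example.

```python
import math

def polar_christoffel_from_flat(r, theta):
    """Second term of (30) with unprimed = Cartesian (Gamma = 0), primed = polar.

    Returns G[mu][lam][nu] = Gamma'^mu_{lam nu}, with index 0 = r, 1 = theta.
    """
    c, s = math.cos(theta), math.sin(theta)
    # S[mu][rho] = d x'^mu / d x^rho: rows (r, theta), columns (x, y).
    S = [[c, s], [-s / r, c / r]]
    # K[rho][lam][nu] = d^2 x^rho / (d x'^lam d x'^nu), x = r cos, y = r sin.
    K = [[[0.0, -s], [-s, -r * c]],   # rho = x
         [[0.0, c], [c, -r * s]]]     # rho = y
    return [[[sum(S[mu][rho] * K[rho][lam][nu] for rho in range(2))
              for nu in range(2)] for lam in range(2)] for mu in range(2)]
```

Evaluating at, say, $r = 3$ should give $\Gamma'^r_{\theta\theta} = -3$ and $\Gamma'^\theta_{r\theta} = \Gamma'^\theta_{\theta r} = 1/3$, with all other components zero.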
Note that the first term is the “uninteresting” part, gathered up from those terms that we said would take care 
of themselves. It tells us how the Christoffel symbol would have transformed had it been a tensor. We worked 
hard to obtain the second term, which, as anticipated, depends on the second derivative $\frac{\partial^2 x^\rho}{\partial x'^\lambda\,\partial x'^\nu}$, in other words, 
on the variation of $S^{-1}$ from place to place. The presence of the second term indicates, as emphasized again and 
again, that the Christoffel symbol is not a tensor. 


Appendix 4: Arguing from the geodesic equation 


The computation in the preceding appendix involves a fair amount of work. In this appendix, we will try to avoid 
work by winging it as much as possible. 

Our starting point is the geodesic $X^\mu(\tau)$, which we first met way back in chapter II.2, and then again in 
chapter V.1, determined by 

$$\frac{dV^\rho}{d\tau} + \Gamma^\rho_{\mu\nu}V^\mu V^\nu = 0 \qquad (31)$$

with $V^\mu = \frac{dX^\mu}{d\tau}$. 

First a notational clarification and a bad notation alert! The geodesic is a curve defined by $X^\mu(\tau)$, with 
$d\tau^2 = -g_{\mu\nu}(X(\tau))\,dX^\mu dX^\nu$. The velocity vector is defined by $V^\mu(\tau) = \frac{dX^\mu}{d\tau}$. Strictly speaking, it is not correct to 
write $V^\mu(X(\tau))$, as some authors sometimes somewhat sloppily do. This notation suggests that there is a vector 
field $V^\mu(x)$ (and there isn’t) defined all over spacetime, and $V^\mu(X(\tau))$ is equal to $V^\mu(x)$ evaluated at $x = X(\tau)$. 
In fact, $V^\mu(\tau)$ is only defined on the particle trajectory (be it a geodesic or not). 


Under a coordinate transformation $x \to x'(x)$, the ur-vector $dx^\mu \to dx'^\mu = \left(\frac{\partial x'^\mu}{\partial x^\nu}\right)dx^\nu = S^\mu{}_\nu(x)\,dx^\nu$. The 
velocity $V^\mu$ is most certainly a vector, since $dX^\mu \to dX'^\mu = S^\mu{}_\nu(X)\,dX^\nu$ is a vector and $d\tau$ is a scalar. So $V'^\mu(\tau) = 
S^\mu{}_\nu(X(\tau))\,V^\nu(\tau)$. Note that the transformation matrix $S^\mu{}_\nu(x)$ is evaluated at $x = X(\tau)$. 

There are levels of the game⁴ in theoretical physics as in any other substantive endeavor. After we learn the 
geodesic equation (31), we could go on happily solving for geodesics in any given spacetime until we are blue in 
the face. On a deeper level, however, we can ask what (31) teaches us about curved spacetime. 

The left hand side of (31) carries a single free index $\rho$ and thus looks like a vector. It better be a vector, since otherwise what one observer sees as a geodesic would not be a geodesic to another observer. Suppose Ms. Unprime writes (31). If the whole package on the left hand side of (31) transforms as claimed, Mr. Prime would have $\frac{dV'^\rho}{d\tau} + \Gamma'^\rho_{\mu\nu}V'^\mu V'^\nu = S^\rho_\sigma\left(\frac{dV^\sigma}{d\tau} + \Gamma^\sigma_{\mu\nu}V^\mu V^\nu\right) = 0$. The two observers agree that particles move along geodesics in spacetime. We have insisted again and again that the laws of physics must transform appropriately, and here is yet another example.

Instead of spacetime, we can talk about curved space (in which case $\tau$ in (31) would denote length rather 
than proper time). A geodesic curve as the path of shortest distance between two points has intrinsic geometric 
meaning and so cannot possibly depend on the coordinate system we use to describe it. For example, a great 
circle on the globe does not depend on the particular system of latitude and longitude we happen to use because 
of British naval power. The geodesic has to satisfy (31) in all coordinates. 

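But, for the same kind of reason as in (2), we see that while $V^\rho$ is a vector, $\frac{dV^\rho}{d\tau}$ assuredly is not. Indeed, $\frac{dV'^\rho}{d\tau} = \frac{d}{d\tau}\left(S^\rho_\sigma(X(\tau))V^\sigma(\tau)\right) = S^\rho_\sigma(X(\tau))\frac{dV^\sigma}{d\tau} + \left(\frac{d}{d\tau}S^\rho_\sigma(X(\tau))\right)V^\sigma(\tau)$. The derivative $\frac{d}{d\tau}$ also acts on the transformation matrix $S$ as it varies along the geodesic. Sound familiar? Thus, $\frac{dV^\rho}{d\tau}$ would have been a vector, had it not been for the second term $\left(\frac{dS^\rho_\sigma}{d\tau}\right)V^\sigma = \frac{dX^\kappa}{d\tau}(\partial_\kappa S^\rho_\sigma)V^\sigma = (\partial_\kappa S^\rho_\sigma)V^\kappa V^\sigma$.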

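As in appendix 3, use the notation $\{\{\frac{dV^\rho}{d\tau}\}\} = (\partial_\kappa S^\rho_\sigma)V^\kappa V^\sigma$ to indicate the extra term that prevents $\frac{dV^\rho}{d\tau}$ from being a vector. Using the notation introduced in (28), we also have for the second term in (31) $\{\{\Gamma^\rho_{\mu\nu}V^\mu V^\nu\}\} = M^\rho_{\mu\nu}S^\mu_\kappa S^\nu_\sigma V^\kappa V^\sigma$. Thus, the requirement $\{\{\frac{dV^\rho}{d\tau} + \Gamma^\rho_{\mu\nu}V^\mu V^\nu\}\} = 0$ gives us $(\partial_\kappa S^\rho_\sigma + M^\rho_{\mu\nu}S^\mu_\kappa S^\nu_\sigma)V^\kappa V^\sigma = 0$. Here comes the nonrigorous part, which renders the argument heuristic. We argue that $\partial_\kappa S^\rho_\sigma + M^\rho_{\mu\nu}S^\mu_\kappa S^\nu_\sigma = 0$, even though we have only shown that this quantity vanishes when multiplied by $V^\kappa V^\sigma$ and evaluated on a geodesic.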
We could of course show by explicit and laborious calculation, as was done in appendix 3, that this is in fact true. 
But it is nevertheless highly plausible, since it holds for any geodesic. 

So we feel that it must be true. Once again, I can appeal to what the American Mathematical Society said, as recounted in a footnote in this chapter. Multiplying by $S^{-1}$ (and relabeling indices), we have $(S^{-1})^\kappa_\lambda\,\partial_\kappa S^\mu_\sigma + M^\mu_{\lambda\nu}S^\nu_\sigma = 0$. In contrast, looking at (5), (28), and (2), we have $\{\{D_\lambda W^\mu\}\} = \left((S^{-1})^\kappa_\lambda\,\partial_\kappa S^\mu_\sigma + M^\mu_{\lambda\nu}S^\nu_\sigma\right)W^\sigma = 0$, so that as claimed, the covariant derivative $D_\lambda W^\mu$ transforms like a tensor.

You perhaps see that this heuristic argument is actually quite simple, even though writing it all out involves 
way too many words. 

At the risk of repeating myself, I close this appendix with two important clarifying remarks: 


1. The geodesic equation is guaranteed to transform like a vector, because the action it was derived from 
is manifestly a scalar; in other words, the left hand side of (31) comes from varying an action invariant 
under general coordinate transformation. 

2. Beginning students are puzzled by the statement “a tensor is something that transforms like a tensor” 
(as mentioned in chapter I.4), because they rarely encounter something that does not transform like a 
tensor. To understand something, it is often easier to understand what would negate that something. 
Here we encounter two objects $\partial_\lambda W^\mu$ and $\Gamma^\mu_{\lambda\nu}W^\nu$ that are not tensors, even though they have indices 
and everything. When they are added together, their separate nontensorial characters cancel out, kind 
of like in the ideal couple alluded to earlier. 


Appendix 5: A constant vector field 


It is often illuminating to look at important concepts from different angles. In this spirit, let me introduce you 
to another friend of mine, the naive guy. 

He tells us excitedly, “I have discovered a constant vector field. I spent years measuring its 16 derivatives $\partial_\mu W^\nu(x)$ in my lab and found that they all vanish within experimental error!”

We explain, “Well, first, your statement is local to your lab, but more importantly, your concept of constant vector field is special to your coordinate system. In our coordinates, $\partial'_\lambda W'^\mu(x') = (S^{-1})^\kappa_\lambda\,\partial_\kappa\left(S^\mu_\nu W^\nu(x)\right) = (S^{-1})^\kappa_\lambda(\partial_\kappa S^\mu_\nu)W^\nu(x) = -\Gamma'^\mu_{\lambda\nu}(x')W'^\nu(x')$ are not zero.”

The naive guy lapses into stunned silence. But we soothe his disappointment by pointing out that his concept of a constant vector field is still useful but has to be defined as a vector field whose covariant derivatives $D'_\lambda W'^\mu(x') = \partial'_\lambda W'^\mu(x') + \Gamma'^\mu_{\lambda\nu}(x')W'^\nu(x')$ vanish.

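Together with the naive guy, we work out (the third equality follows from (7))

$$\Gamma'^\mu_{\lambda\nu} = -(S^{-1})^\kappa_\lambda(\partial_\kappa S^\mu_\sigma)(S^{-1})^\sigma_\nu = -(\partial'_\lambda S^\mu_\sigma)(S^{-1})^\sigma_\nu = S^\mu_\sigma\,\partial'_\lambda(S^{-1})^\sigma_\nu = \frac{\partial x'^\mu}{\partial x^\sigma}\frac{\partial^2 x^\sigma}{\partial x'^\lambda\,\partial x'^\nu} \tag{32}$$

in agreement with (30). This shows that the naive guy’s concept of constancy holds only in those coordinates related linearly to his, namely $x^\mu = a^\mu_\nu x'^\nu + b^\mu$. Then we have $\Gamma'^\mu_{\lambda\nu}(x') = 0$. He happens to have chosen locally flat coordinates!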


Appendix 6: Lie derivative once more 


It may be worthwhile to approach the Lie derivative from a slightly different direction. In elementary physics, you learned about the gradient of a function $\vec\nabla f$ and the rate of change of $f$ in the direction of a given vector $\vec v$, namely $\vec v\cdot\vec\nabla f$.

Given a vector field $V^\mu(x)$ and a scalar field $\phi(x)$, we use this notion to define the Lie derivative $\mathcal{L}_V\phi = V^\mu\partial_\mu\phi$. We then attempt to generalize this to the Lie derivative of a vector field $W^\mu(x)$.

For our first try, we write down $V^\nu(x)\partial_\nu W^\mu(x)$, but this does not transform properly. In the text, by promoting $\partial_\nu$ to the covariant derivative $D_\nu$ we could make $V^\nu(x)D_\nu W^\mu(x)$ transform like a vector. But another possibility suggests itself! $W^\nu(x)\partial_\nu V^\mu(x)$ also transforms badly, and as described somewhat picturesquely in the text, two objects both with character defects could join together to form a couple in which the defects cancel out. Thus, we are led to define

$$\mathcal{L}_V W^\mu = V^\nu\partial_\nu W^\mu - W^\nu\partial_\nu V^\mu = -\mathcal{L}_W V^\mu \tag{33}$$


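and then to show that it transforms properly.

Insisting on $\mathcal{L}_V(W^\mu U_\mu) = V^\nu\partial_\nu(W^\mu U_\mu)$ (since $W^\mu U_\mu$ is a scalar) and the product rule $\mathcal{L}_V(W^\mu U_\mu) = (\mathcal{L}_V W^\mu)U_\mu + W^\mu(\mathcal{L}_V U_\mu)$, we are forced to

$$\mathcal{L}_V U_\mu = V^\nu\partial_\nu U_\mu + U_\nu\partial_\mu V^\nu \tag{34}$$

Compare the opposite signs in (33) and (34) with the opposite signs in (11) and (12).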
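
For readers who want the step from the product rule to (34) written out, here is a sketch of the algebra. It uses only (33) and the scalar rule $\mathcal{L}_V\phi = V^\nu\partial_\nu\phi$; the relabeling of dummy indices in the last line is the only manipulation not spelled out in the text.

```latex
\begin{align*}
\mathcal{L}_V(W^\mu U_\mu)
  &= V^\nu\partial_\nu(W^\mu U_\mu)
   = (V^\nu\partial_\nu W^\mu)U_\mu + W^\mu V^\nu\partial_\nu U_\mu
   && \text{(scalar rule)}\\
\mathcal{L}_V(W^\mu U_\mu)
  &= (V^\nu\partial_\nu W^\mu - W^\nu\partial_\nu V^\mu)U_\mu + W^\mu(\mathcal{L}_V U_\mu)
   && \text{(product rule and (33))}\\
\Rightarrow\quad W^\mu(\mathcal{L}_V U_\mu)
  &= W^\mu V^\nu\partial_\nu U_\mu + W^\nu(\partial_\nu V^\mu)U_\mu\\
  &= W^\mu\left(V^\nu\partial_\nu U_\mu + U_\nu\partial_\mu V^\nu\right)
   && \text{(relabel } \mu\leftrightarrow\nu \text{ in the last term)}
\end{align*}
```

Since $W^\mu$ is arbitrary, the quantity in parentheses must equal $\mathcal{L}_V U_\mu$, which is (34).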
Now that we have (33) and (34), we can define $\mathcal{L}_V$ acting on any tensor by pretending that it is composed by multiplying vectors together (for example, pretending that $T^\mu{}_{\nu\lambda}$ is equal to $W^\mu U_\nu Y_\lambda$) and invoking the product rule. Thus, $\mathcal{L}_V T_{\mu\nu} = V^\lambda\partial_\lambda T_{\mu\nu} + T_{\lambda\nu}\partial_\mu V^\lambda + T_{\mu\lambda}\partial_\nu V^\lambda$.

In chapter IX.6, we will pose the following question. Given a metric $g_{\mu\nu}$, does there exist a vector field $\xi(x)$ such that its Lie derivative acting on the metric vanishes? If so, then $\xi(x)$ is known as a Killing field. We see that the condition for a Killing field to exist is

$$\mathcal{L}_\xi g_{\mu\nu} = \xi^\lambda\partial_\lambda g_{\mu\nu} + g_{\lambda\nu}\partial_\mu\xi^\lambda + g_{\mu\lambda}\partial_\nu\xi^\lambda = 0 \tag{35}$$


We will study this equation in detail in chapter IX.6. 


Appendix 7: Curved spacetime in the lab? 


The subject of this appendix is somewhat off the subject of this chapter, but I wish to introduce you to an 
interesting and amusing area of physics, namely that of constructing analog curved spacetime in the lab. Before 
you say “what?”, note the word “analog.” 

To set up the discussion, consider a scalar field $\varphi(x)$ in Minkowski spacetime governed by the action $S = -\int d^4x\left(\frac12\eta^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi + V(\varphi)\right)$, with $V(\varphi)$ some function of $\varphi$. (You might recall that we first encountered an action of this form way back in appendix 2 of chapter II.3, before we even got to special relativity.) To keep the discussion simple, we will ignore $V(\varphi)$ or simply set it to 0.

In the discussion leading up to (18), we showed the power of the equivalence principle. We obtained the action of an electromagnetic field in the presence of a gravitational field by simply replacing the Minkowski metric $\eta_{\mu\nu}$ in the Maxwell action by $g_{\mu\nu}$. We now repeat this amazing feat: we obtain the action governing a scalar field in curved spacetime, namely $S = -\frac12\int d^4x\sqrt{-g}\,g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi$.

Phew, that was easy! Exactly: the equivalence principle is powerful stuff. 

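Now that we have followed Einstein to the scalar field action in curved spacetime, we follow Euler and Lagrange to the corresponding equation of motion, namely $\partial_\mu\left(\frac{\delta\mathcal{L}}{\delta\,\partial_\mu\varphi}\right) = 0$. This works out to be

$$\frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\varphi\right) = 0 \tag{36}$$

Applying what we learned in this chapter, we recognize this as just $D_\mu D^\mu\varphi = 0$, the curved spacetime version of the flat spacetime equation of motion $\partial_\mu\partial^\mu\varphi = 0$.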
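
The identification of (36) with $D_\mu D^\mu\varphi = 0$ can be sketched in a few lines. The sketch assumes two standard facts that are not rederived here: the covariant divergence of a vector is $D_\mu W^\mu = \partial_\mu W^\mu + \Gamma^\mu_{\mu\lambda}W^\lambda$, and the contracted Christoffel symbol is $\Gamma^\mu_{\mu\lambda} = \frac{1}{\sqrt{-g}}\partial_\lambda\sqrt{-g}$.

```latex
\begin{align*}
D_\mu D^\mu\varphi
  &= D_\mu W^\mu, \qquad W^\mu \equiv g^{\mu\nu}\partial_\nu\varphi
   \quad (\text{since } D_\nu\varphi = \partial_\nu\varphi \text{ for a scalar})\\
  &= \partial_\mu W^\mu + \Gamma^\mu_{\mu\lambda}W^\lambda\\
  &= \partial_\mu W^\mu + \frac{1}{\sqrt{-g}}\left(\partial_\lambda\sqrt{-g}\right)W^\lambda\\
  &= \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,W^\mu\right)
   = \frac{1}{\sqrt{-g}}\,\partial_\mu\left(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\varphi\right)
\end{align*}
```

Setting the last expression to zero reproduces (36).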
Fine—now what? The point is that (36), when written out in terms of $\frac{\partial^2\varphi}{\partial t^2}$, $\frac{\partial^2\varphi}{\partial x^2}$, $\frac{\partial^2\varphi}{\partial y^2}$, and so on (with a slight abuse of notation here), is just a second order partial differential equation. But second order partial differential equations appear in many areas of physics, and some of these could be written as (36) for some $g_{\mu\nu}$. For example, consider the Bose-Einstein condensate as a fluid. Its phase angle $\varphi(t, x)$ satisfies a second order partial differential equation; when written in the form given in (36), the equation can be interpreted as a scalar field moving in curved spacetime. For instance, terms that involve both $\frac{\partial}{\partial t}$ and $\frac{\partial}{\partial x}$ correspond to the entry $g^{tx}$ in $g^{\mu\nu}$ in (36). The name of the game is to set up some flow in the lab (such as that of a fluid rushing down a drain) that corresponds to an interesting analog curved spacetime, such as that of a black hole!

Incidentally, we will come back to the scalar field action in curved spacetime in chapter VIII.4, when we discuss inflationary cosmology.


Exercises 
1 Show that the divergence of a tensor is given by $D_\mu T^{\mu\nu} = \partial_\mu T^{\mu\nu} + \Gamma^\mu_{\mu\lambda}T^{\lambda\nu} + \Gamma^\nu_{\mu\lambda}T^{\mu\lambda}$.


2 Given the covariant derivative $D_\lambda W^\mu$ in (5), integrate $\int d^4x\sqrt{-g}\,T^\lambda{}_\mu D_\lambda W^\mu$ to obtain the covariant divergence 
of the tensor $T^\lambda{}_\mu$. Verify that it agrees with the result of exercise 1. 


3 Using the explicit expression for the Christoffel symbol in terms of the metric, show that $D_\lambda g_{\mu\nu} = 0$. Note 
that in contrast $\partial_\lambda g_{\mu\nu}$ definitely does not vanish. Thus, the metric is a very special tensor: it is the tensor 
with two lower indices that is covariantly constant. Also, note that the condition $D_\lambda g_{\mu\nu} = 0$ can be used to 
determine the Christoffel symbol. Check this for the sphere. Show also that $D_\lambda g^{\mu\nu} = 0$. 


4 Add to the electromagnetic action a term coupling $A_\mu$ to a current and vary to find Maxwell's equations in 
curved spacetime. 


5 Evaluate the electromagnetic action in the expanding universe discussed in chapter V.3 in terms of the electric 
and magnetic fields. 


6 Define the covariant derivative of a scalar along a curve. 


7 Define the covariant derivative of a tensor along a curve. 


Notes 


1. The question “How do you do?”—perhaps short for “How do you do it day after day?”—could be stated as 
“How do you transform under time translation?” 

2. Incidentally, you will also need this identity repeatedly when doing quantum field theory. 

3. To use a stock market analogy, it is not the change in earnings, but the difference between actual earnings 
and expected earnings, that counts. 


4. J. McPhee, Levels of the Game, Farrar Straus Giroux, 1979. 
5. See the interesting work of Luis Garay, in particular a talk he gave in Leiden in 2007. 


Recap to Part V 


If you hold a ketchup bottle upside down and wait for the ketchup to come out, you are 
applying gravity, but if you shake the bottle or hit it, you are applying the inertial principle, 
though in both cases merely in the Newtonian limit. The truly amazing thing is that the 
second strategy, involving accelerating frames, contains the seed of the first strategy! 

Remarkably, any layperson familiar with airline maps can grasp Einstein’s equivalence 
principle, one of the deepest and most powerful principles in physics. 

The message is that if you know how to change coordinates, you almost know curved 
space and curved spacetime, and once you know how to find “straight” lines in curved 
space, you know how to track the motion of particles in curved spacetime! 

Honoring the fundamental principle that physics should not depend on the physicist, 
we have to understand how things transform. Understanding that, we know how to 
differentiate. 

Interestingly, analog curved spacetimes may appear in an actual lab. 


Part VI | Einstein’s Field Equation Derived and Put to Work 


To Einstein’s Field Equation as Quickly as Possible 


Years of intense longing 


[Now] the happy achievement seems almost a matter of 
course. ... But the years of anxious searching in the dark, 
with their intense longing, their alternations of confidence and 
exhaustion, and the final emergence into the light;—only those 
who have experienced it can understand that. 


—A. Einstein¹ 


Traditionally, students of general relativity often feel like foot soldiers in the Napoleonic 
army on an interminable march² toward Moscow. After conquering tensors, there is the 
battle of differential geometry, and on, and on. I certainly felt that way. For many, even 
learning Einstein gravity could be characterized as “intense longing, alternating with 
confidence and exhaustion.” In this chapter, I will attempt the pedagogical equivalent of 
airlifting you, given that you now know how to differentiate a vector, directly to Einstein’s 
action for gravity. 

Let me take you to Einstein’s field equation as quickly as possible, starting with what 
you already know. My pedagogical philosophy is to keep things as simple as possible. I 
will necessarily have to take shortcuts, but when I do, I will alert you. I could elaborate and 
expand on each point, but it is better to come back and do that later. Remember, it took 
Einstein 10 years to get there. Rather than deriving everything at once, we will wing it at 
times; you will see what I mean. 

Before we start, let’s take stock of our situation. Thanks to the equivalence principle 
(whose essence I claim that Galileo could have understood), you and I know that gravity 
amounts to curved spacetime. Thanks to how I set up this book, starting with coordinate 
changes leading immediately to curved space, and with the action principle capable of 
incorporating immediately a curved background, you and I have come a very long way 
indeed. If somebody gives us a curved spacetime, we could jump to it and figure out how 
particles, massive or massless, move in this curved spacetime. 

But you and I do not know how to generate this spacetime. Yes, we know how to use 
symmetry to restrict the form of the metric, as in chapters V.2 and V.3, so that it depends on 
merely one or two functions in the more symmetric cases (a(t) for the expanding universe, 
and A(r), B(r) for the Schwarzschild metric). But how to determine these functions? 

Let’s understand where we are by analogy with what we know already. In Newtonian gravity, we first learned in chapter I.1 how particles move in a gravitational field $\Phi(x)$. But it was not until chapter II.1 that we, suitably inspired by the hanging membrane, figured out how to determine $\Phi(x)$: simply add $\int dt\,d^3x\,(\vec\nabla\Phi)^2$ to the action, that is, $(\vec\nabla\Phi)^2$ to the Lagrangian. In electromagnetism, our present situation with regard to gravity is analogous to our knowing, in chapter IV.1, the Lorentz force law telling us how particles move in an electromagnetic field, but not Maxwell's equations. This was remedied in chapter IV.2 by adding $F_{\mu\nu}F^{\mu\nu} = (\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial_\lambda A_\rho - \partial_\rho A_\lambda)\eta^{\mu\lambda}\eta^{\nu\rho}$ to the Lagrangian.


Searching for something containing two derivatives 


In all these cases, we add to the Lagrangian a term quadratic in the field ($\Phi$ in one case, $A_\mu$ in the other) and quadratic in derivatives (spatial or temporal). In fact, all this parallels the point particle Lagrangian $\frac12 m\left(\frac{dq}{dt}\right)^2$. As Einstein said, it now “seems almost a matter of course,” at least in hindsight. To describe the dynamics of the gravitational field, we are evidently invited to add a term involving two powers of derivative acting on the metric, a term that reduces to $(\vec\nabla\Phi)^2$ in the nonrelativistic weak field limit.

The search for actions containing two powers of $\partial$ has served as a “golden” guiding principle in theoretical physics: golden because it has worked³ from Newtonian mechanics to grand unified theory, and because theoretical physicists do not know how to handle the inherent instability⁴ in dynamics with higher powers of time derivative, a little known instability discovered in 1850 by the Russian M. V. Ostrogradsky. We have already encountered this principle as far back as chapter II.3 (see appendix 1), and we will encounter it again on a number of occasions. In this chapter we will see that it works for Einstein gravity.

We also immediately notice some crucial differences. In Newtonian gravity and in Maxwellian electrodynamics, the relevant fields $\Phi$ and $A_\mu$ vanish* in the absence of the gravitational field and the electromagnetic field, respectively. In contrast, in the absence of a gravitational field, we know that $g_{\mu\nu}$ reduces to $\eta_{\mu\nu}$. Indeed, we know from our discussions in chapters IV.1 and V.2 that in the weak field limit, $g_{00} \simeq -(1 + 2\Phi)$. In general, let’s write $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$; it is $h_{\mu\nu}$ that measures the deviation of the spacetime from flat spacetime and that may play a role analogous to $\Phi$ and $A_\mu$. Furthermore, we expect that the inverse metric $g^{\mu\nu}$ will also appear in the Lagrangian (since in Maxwell’s Lagrangian, $\eta^{\mu\nu}$ already enters, and certainly there is no reason for $g^{\mu\nu}$ not to appear). If so, since $g^{\mu\nu}$ is the inverse of $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ and hence an infinite series in the field $h_{\mu\nu}$, we might anticipate that the gravitational Lagrangian we are searching for may be considerably more complicated than quadratic in $h_{\mu\nu}$. (In other words, it may only be in the weak field limit that the Lagrangian has the schematic form $\sim (\partial h)^2$. For further discussion along this line, see chapter IX.5.)

* Up to a trivial additive constant in the case of $\Phi$ and a gauge transform in the case of $A_\mu$.
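
The remark that $g^{\mu\nu}$ is an infinite series in $h_{\mu\nu}$ can be made explicit with the geometric series for a matrix inverse. In the sketch below, indices on $h$ are raised with $\eta^{\mu\nu}$; that convention is an assumption of this illustration, as is the smallness of $h$ needed for the series to make sense.

```latex
\begin{align*}
g_{\mu\nu} &= \eta_{\mu\nu} + h_{\mu\nu}, \qquad
h^{\mu\nu} \equiv \eta^{\mu\alpha}\eta^{\nu\beta}h_{\alpha\beta}, \qquad
h^\mu{}_\nu \equiv \eta^{\mu\alpha}h_{\alpha\nu}\\
g^{\mu\nu} &= \eta^{\mu\nu} - h^{\mu\nu} + h^{\mu}{}_{\lambda}h^{\lambda\nu}
             - h^{\mu}{}_{\lambda}h^{\lambda}{}_{\kappa}h^{\kappa\nu} + \cdots\\
\text{check to first order:}\quad
g^{\mu\lambda}g_{\lambda\nu}
  &= (\eta^{\mu\lambda} - h^{\mu\lambda})(\eta_{\lambda\nu} + h_{\lambda\nu}) + O(h^2)
   = \delta^\mu_\nu + h^\mu{}_\nu - h^\mu{}_\nu + O(h^2)
   = \delta^\mu_\nu + O(h^2)
\end{align*}
```

Any Lagrangian built from $g^{\mu\nu}$ therefore contains all powers of $h$, which is why only its weak field limit is quadratic.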

An important consideration is the restriction imposed by symmetry. In Newtonian gravity, rotational symmetry forbids us to write, instead of the familiar and nice

$$(\vec\nabla\Phi)^2 = \left(\frac{\partial\Phi}{\partial x}\right)^2 + \left(\frac{\partial\Phi}{\partial y}\right)^2 + \left(\frac{\partial\Phi}{\partial z}\right)^2$$

something dreadful like

$$a_1\left(\frac{\partial\Phi}{\partial x}\right)^2 + a_2\left(\frac{\partial\Phi}{\partial y}\right)^2 + a_3\left(\frac{\partial\Phi}{\partial z}\right)^2 + b_{12}\left(\frac{\partial\Phi}{\partial x}\frac{\partial\Phi}{\partial y}\right) + \cdots$$

with arbitrary coefficients $a_1, a_2, \cdots$. (In other words, we know from experiments that space does not pick out a special direction, and so $a_1 = a_2$, $b_{12} = 0$, and so forth to a high degree of accuracy.) In the case of electromagnetism, we have Lorentz symmetry and gauge invariance, which together fix the form $F_{\mu\nu}F^{\mu\nu}$. (I have already emphasized in chapter IV.2 that gauge invariance is not a symmetry as such, but a redundancy in the description: $A_\mu$ and $A_\mu + \partial_\mu\Lambda$ describe the same physics.) Thus, for example, we can’t have something like $\partial_\mu A_\nu\,\partial_\lambda A_\rho\,\eta^{\mu\lambda}\eta^{\nu\rho}$ for the electromagnetic Lagrangian. In the case of gravity, the requirement is even more stringent: the action must be invariant under coordinate transformations $x \to x'(x)$. Again, this indicates a redundancy in the description: different coordinate systems can describe the same spacetime.


Einstein’s search for action and Riemann’s quest for curvature 


At this point, you might suddenly realize that Einstein’s search for an action for gravity is 
more than intimately linked to Riemann’s quest for an invariant or a covariant description 
of curvature. We kept saying in part I that the Riemann curvature tensor must involve two 
powers of derivatives acting on the metric and must transform properly as a tensor. Here 
we want to find an action. Indeed, in this chapter, we will solve both problems at once: find 
the Riemann curvature tensor and the action for gravity. Two birds with one stone! 

In the history of the intersection between physics and mathematics, this realization, that 
gravity and curvature are one and the same, represents one of the most profound insights 
ever. Perhaps we are reminded of the search for a mechanics of motion and the quest for 
a calculus to describe infinitesimal change. There the solution was essentially provided by 
one single person. 

Incidentally, even with all this background, the correct form for what we seek is far from 
obvious. You could challenge yourself by finding it without reading on. The invariance of 
the action under coordinate transformations suggests an expression with two powers of the 
covariant derivative $D_\lambda$ and the metric, but we also know that $D_\lambda g_{\mu\nu}$ vanishes identically. 
So the correct expression must involve terms with the ordinary derivative acting on the 
metric but arranged in such a way that the entire package is invariant under coordinate 
transformations. 


Breaking the Newton-Leibniz rule 


After all this preamble, setting the stage as it were, we now roll up our sleeves and get to 
work. Start with what you know: the covariant derivative of a vector with a lower index: 


$$D_\nu S_\rho = \partial_\nu S_\rho - \Gamma^\sigma_{\nu\rho}S_\sigma \tag{1}$$


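Let us first go back to elementary calculus. Given the value of a function $f(x)$ and its derivative at some point $x$, then the value of the function at the point $x + \delta x$ a short distance away is of course given by $\left(1 + \delta x\frac{d}{dx}\right)f(x) \simeq f(x + \delta x)$. Think of the operation $\left(1 + \delta x\frac{d}{dx}\right)$ as a translation operator: it moves or translates the function from the point $x$ to the point $x + \delta x$. Generalize to 2-dimensional flat space. Consider a small rectangle with opposite corners at $(x, y)$ and $(x + \delta x, y + \delta y)$. To find the value of a function at $(x + \delta x, y + \delta y)$, we can translate first in the x direction and then in the y direction:

$$\left(1 + \delta y\frac{\partial}{\partial y}\right)\left(1 + \delta x\frac{\partial}{\partial x}\right)f(x, y) = \left(1 + \delta x\frac{\partial}{\partial x} + \delta y\frac{\partial}{\partial y} + \delta x\,\delta y\frac{\partial}{\partial y}\frac{\partial}{\partial x}\right)f(x, y) \tag{2}$$

We apply the translation operator in the x direction, followed by the translation operator in the y direction.

Now let us ask a seemingly pointless question: suppose we translate first in the y direction and then in the x direction. We could travel from the corner $(x, y)$ to the diagonally opposite corner $(x + \delta x, y + \delta y)$ along the edges of the rectangle in two different ways. You say, we would get the same answer, of course. Indeed, the difference between the two ways of getting $f(x + \delta x, y + \delta y)$ is equal to

$$\left(\left(1 + \delta x\frac{\partial}{\partial x} + \delta y\frac{\partial}{\partial y} + \delta x\,\delta y\frac{\partial}{\partial y}\frac{\partial}{\partial x}\right) - (x \leftrightarrow y)\right)f(x, y) = \delta x\,\delta y\left[\frac{\partial}{\partial y}, \frac{\partial}{\partial x}\right]f(x, y) = 0 \tag{3}$$

where I have introduced the commutator⁵ $\left[\frac{\partial}{\partial y}, \frac{\partial}{\partial x}\right] \equiv \frac{\partial}{\partial y}\frac{\partial}{\partial x} - \frac{\partial}{\partial x}\frac{\partial}{\partial y}$. In the last step, I used the Newton-Leibniz rule that the order of taking derivatives does not matter.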

Now that you have mastered this laughably easy stuff, we are ready to move on to curved space. Instead of a simple function $f(x)$, we will translate an arbitrary vector $S_\rho(x)$. We apply the translation operator $(1 + (\delta_1 x)^\nu D_\nu)$, since we have just learned that in curved space, the ordinary derivative $\partial_\nu$ is to be replaced by the covariant derivative $D_\nu$. Here $(\delta_1 x)$ denotes an infinitesimal displacement with components $(\delta_1 x)^\nu$ (with $\nu = 0, 1, \cdots, d$).

Let us now play the same game as before and consider a curved “rectangle” with $(\delta_1 x)$ along one edge, and some other infinitesimal displacement $(\delta_2 x) = (\delta_2 x)^\mu$ along the other edge. See figure 1. We translate $S_\rho(x)$ first along $(\delta_1 x)$ and then along $(\delta_2 x)$. The result is then $(1 + (\delta_2 x)^\mu D_\mu)(1 + (\delta_1 x)^\nu D_\nu)S_\rho(x)$. We then go the other way around, with the result $(1 + (\delta_1 x)^\nu D_\nu)(1 + (\delta_2 x)^\mu D_\mu)S_\rho(x)$. The difference in the two results is then given by $(\delta_2 x)^\mu(\delta_1 x)^\nu[D_\mu, D_\nu]S_\rho(x)$. Were we in flat space, this quantity would have been zero. Thus, the nonvanishing of this quantity measures the curvature at the point described by $x$.

[Figure 1: Displacing a vector in two different ways to the opposite corner of a curved rectangle with edges $\delta_1 x$ and $\delta_2 x$.]

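So, let’s compute this commutator of two covariant derivatives acting on our vector $S$: $[D_\mu, D_\nu]S_\rho = D_\mu D_\nu S_\rho - D_\nu D_\mu S_\rho$.

Before we drown in a sea of indices, let us anticipate the structure of what we will get by schematically doing the calculation, suppressing all but the two most essential indices: $[D_\mu, D_\nu]S_\cdot \sim (\partial_\mu + \Gamma^\cdot_{\mu\cdot})(\partial_\nu + \Gamma^\cdot_{\nu\cdot})S_\cdot - (\mu \leftrightarrow \nu)$. Remarkably, we will not end up with any partial derivative $\partial$ acting on $S$. First, obviously $\partial_\mu\partial_\nu - (\mu \leftrightarrow \nu)$ vanishes, once again as Newton and Leibniz assured us. But what about one power of $\partial$ acting on $S$? The relevant terms are $\Gamma^\cdot_{\mu\cdot}\partial_\nu S_\cdot + \Gamma^\cdot_{\nu\cdot}\partial_\mu S_\cdot - (\mu \leftrightarrow \nu)$, which vanishes. (We will shortly put in all the indices and do it more carefully.) Thus, impressionistically, we obtain

$$[D_\mu, D_\nu]S_\cdot \sim [\partial_\mu + \Gamma^\cdot_{\mu\cdot},\ \partial_\nu + \Gamma^\cdot_{\nu\cdot}]S_\cdot \sim \left((\partial_\mu\Gamma^\cdot_{\nu\cdot} + \Gamma^\cdot_{\mu\cdot}\Gamma^\cdot_{\nu\cdot}) - (\mu \leftrightarrow \nu)\right)S_\cdot \tag{4}$$

The result turns out to be $S$ multiplied by a tensor:

$$[D_\mu, D_\nu]S_\rho = -R^\sigma{}_{\rho\mu\nu}S_\sigma \tag{5}$$

(The minus sign is to conform to convention.) As the use of the capital letter $R$ may suggest, $R^\sigma{}_{\rho\mu\nu}$ is the celebrated Riemann curvature tensor. It is manifestly a tensor, since the left hand side of (5) is a tensor: $D_\mu D_\nu S_\rho$ and $D_\nu D_\mu S_\rho$ are both tensors, and the difference between two tensors is a tensor.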


Riemann curvature tensor 


What are the differential laws which determine the Riemannian 
metric (i.e. g,,,) itself? . . . [The] solution obviously needed 
invariant differential systems of the second order taken from 
$g_{\mu\nu}$. We⁶ soon saw that these had already been established by 
Riemann (the tensor of curvature). We had already considered 
the right field equation for gravitation for two years before the 
publication of the general theory of relativity, but we were unable 
to see how they could be used in physics. On the contrary I 
felt sure that they could not do justice to experience. Moreover 
I believed that I could show on general considerations that a 
law of gravitation invariant in relation to any transformation 
of coordinates whatever was inconsistent with the principle of 
causation. These were errors of thought which cost me two years 
of excessively hard work, until I finally recognized them as such 
at the end of 1915 and succeeded in linking the question up 
with the facts of astronomical experience, after which I ruefully 
returned to the Riemannian curvature. 


—A. Einstein⁷ 


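Now let’s compute for real, keeping track of all indices. We learned in (V.6.14) that the second covariant derivative is given by

$$D_\mu D_\nu S_\rho = \partial_\mu(D_\nu S_\rho) - \Gamma^\sigma_{\mu\nu}D_\sigma S_\rho - \Gamma^\sigma_{\mu\rho}D_\nu S_\sigma \tag{6}$$

To calculate $[D_\mu, D_\nu]S_\rho$, we are to subtract from (6) the expression we obtain by interchanging $\mu$ and $\nu$ in (6). We notice that the second term is symmetric in $\mu$ and $\nu$ and so can be dropped. Next, for $D_\nu S_\rho$ in the first and third terms, we insert the expression in (1). Thus,

$$\begin{aligned}
[D_\mu, D_\nu]S_\rho &= \partial_\mu(\partial_\nu S_\rho - \Gamma^\sigma_{\nu\rho}S_\sigma) - \Gamma^\sigma_{\mu\rho}(\partial_\nu S_\sigma - \Gamma^\kappa_{\nu\sigma}S_\kappa) - (\mu \leftrightarrow \nu)\\
&= \partial_\mu\partial_\nu S_\rho - (\partial_\mu\Gamma^\sigma_{\nu\rho})S_\sigma - \Gamma^\sigma_{\nu\rho}\partial_\mu S_\sigma - \Gamma^\sigma_{\mu\rho}\partial_\nu S_\sigma + \Gamma^\sigma_{\mu\rho}\Gamma^\kappa_{\nu\sigma}S_\kappa - (\mu \leftrightarrow \nu)
\end{aligned} \tag{7}$$

Indeed, the calculation is even simpler than shown here if we trust the preceding argument and do not even bother to write down any term involving $\partial_\cdot S_\cdot$ and $\partial_\cdot\partial_\cdot S_\cdot$. But let’s be careful.

Happily, lots of terms knock each other off when we antisymmetrize in $\mu$ and $\nu$. The first term goes away, and the third and fourth terms go away together, in accordance with our earlier “sloppier” argument. Thus we are left with (for convenience, we have interchanged the dummy indices $\kappa$ and $\sigma$ in the fifth term in (7))

$$\begin{aligned}
[D_\mu, D_\nu]S_\rho &= -(\partial_\mu\Gamma^\sigma_{\nu\rho})S_\sigma + \Gamma^\kappa_{\mu\rho}\Gamma^\sigma_{\nu\kappa}S_\sigma - (\mu \leftrightarrow \nu)\\
&= -\left(\partial_\mu\Gamma^\sigma_{\nu\rho} - \Gamma^\kappa_{\mu\rho}\Gamma^\sigma_{\nu\kappa} - (\mu \leftrightarrow \nu)\right)S_\sigma
\end{aligned} \tag{8}$$

in agreement with (4). Comparing with (5), we obtain the defining expression for the Riemann curvature tensor:*

$$R^\lambda{}_{\rho\mu\nu} = \left(\partial_\mu\Gamma^\lambda_{\nu\rho} + \Gamma^\lambda_{\mu\kappa}\Gamma^\kappa_{\nu\rho}\right) - \left(\partial_\nu\Gamma^\lambda_{\mu\rho} + \Gamma^\lambda_{\nu\kappa}\Gamma^\kappa_{\mu\rho}\right) \tag{9}$$

Note that once we arrive here, we could care less about $S_\rho$: it is just a convenient crutch.

In summary, as anticipated, curvature expresses the failure of the Newton-Leibniz rule for covariant derivatives.† Since the Christoffel symbol has the schematic form $\Gamma^\cdot_{\cdot\cdot} \sim g^{\cdot\cdot}\partial_\cdot g_{\cdot\cdot}$, the curvature tensor $R^\cdot{}_{\cdot\cdot\cdot}$ involves‡ two derivatives acting on the metric, as we anticipated, here and as far back as in chapter I.6.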
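
As a sanity check on (9), one can evaluate it numerically on the unit 2-sphere. The sketch below is an illustration only: it assumes the standard sphere Christoffel symbols $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$ (not derived in this section), takes the derivative in (9) by a central finite difference, and uses hypothetical function names.

```python
import math

DIM = 2  # index 0 = theta, index 1 = phi


def christoffel(theta):
    """G[l][m][n] = Gamma^l_{mn} on the unit 2-sphere (assumed standard values)."""
    G = [[[0.0] * DIM for _ in range(DIM)] for _ in range(DIM)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G


def d_christoffel(theta, h=1e-5):
    """dG[m][l][n][r] = partial_m Gamma^l_{nr}; nothing depends on phi."""
    Gp, Gm = christoffel(theta + h), christoffel(theta - h)
    d_theta = [[[(Gp[l][n][r] - Gm[l][n][r]) / (2 * h) for r in range(DIM)]
                for n in range(DIM)] for l in range(DIM)]
    d_phi = [[[0.0] * DIM for _ in range(DIM)] for _ in range(DIM)]
    return [d_theta, d_phi]


def riemann(theta):
    """R[l][r][m][n] = R^l_{r m n}, assembled term by term from (9)."""
    G, dG = christoffel(theta), d_christoffel(theta)
    R = [[[[0.0] * DIM for _ in range(DIM)] for _ in range(DIM)] for _ in range(DIM)]
    for l in range(DIM):
        for r in range(DIM):
            for m in range(DIM):
                for n in range(DIM):
                    val = dG[m][l][n][r] - dG[n][l][m][r]
                    for k in range(DIM):
                        val += G[l][m][k] * G[k][n][r] - G[l][n][k] * G[k][m][r]
                    R[l][r][m][n] = val
    return R
```

With these conventions, `riemann(theta)[0][1][0][1]` should come out close to $\sin^2\theta$, and the result is antisymmetric in its last two indices, as (9) requires.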


* In part IX, we will give two alternative derivations of $R^\lambda{}_{\rho\mu\nu}$: one based on parallel transport (and closely related to the derivation given here) and one based on geodesic deviation.

† This derivation has the added advantage that when and if you study Yang-Mills theory, you will see essentially the same argument. The Yang-Mills field strength is also given by the commutator of two covariant derivatives. Indeed, if you are familiar⁸ with the gauge invariant derivative in electromagnetism $D_\mu = \partial_\mu - iA_\mu$, we also have $[D_\mu, D_\nu] = -iF_{\mu\nu}$, the electromagnetic field strength.

‡ The curvature tensor clearly involves $\sim \partial\partial g$ and $\sim \partial g\,\partial g$. When evaluating $\partial_\cdot g^{\cdot\cdot}$, note that $g^{\cdot\cdot}$ is the inverse of $g_{\cdot\cdot}$ and recall the identity (V.6.7).

In a locally flat coordinate system, the Riemann curvature tensor expresses the second order deviation of the metric from being flat, as we discussed in part I. In fact, $R^\sigma{}_{\rho\mu\nu}$ is the only tensor we can form out of two derivatives acting on the metric.

Professor Flat pops up, declaring, “That’s easy to prove! Suppose there is another tensor with the same properties as the Riemann curvature tensor. Form the difference $\Delta$ between the two. Since $\Delta$ vanishes in a locally flat system and since it is a tensor by assumption, it vanishes in all frames.” This uniqueness is of crucial importance when we come to construct the action for gravity.


Symmetry properties of the Riemann tensor 


The Riemann curvature tensor $R^\lambda{}_{\rho\mu\nu}$ is a formidable object, but fortunately it enjoys various symmetry properties upon interchange of indices that will make our lives a lot easier. Already, from the derivation in (8) and (9), we know that it is antisymmetric under the interchange of $\mu$ and $\nu$:

$$R^\lambda{}_{\rho\mu\nu} = -R^\lambda{}_{\rho\nu\mu} \tag{10}$$

To discover further symmetries, we first let the four indices have, at least nominally, equal status. Thus, let us lower the index $\lambda$: $R_{\tau\rho\mu\nu} = g_{\tau\lambda}R^\lambda{}_{\rho\mu\nu}$.

What makes us think that there are additional symmetries? Our fingers. We count. Way back when in chapter I.6, we counted that 20 numbers were required to specify curvature in 4-dimensional spacetime. Here we count $4 \times 4 \times \frac12(4 \times 3) = 96$ components in $R_{\tau\rho\mu\nu}$ thus far, which have to be reduced to 20 by symmetries.

Professor Flat ambles by again, just in time to save us a lot of work. He says: “Since 
you are looking for symmetry properties of a tensor, you could simply go to a locally flat 
coordinate system around some generic point P, just as in chapter I.6.” 

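So, translate our coordinate system so that $P$ is at the origin $x = 0$. Expanding, we write

$$g_{\tau\mu}(x) = \eta_{\tau\mu} + B_{\tau\mu,\lambda\sigma}x^\lambda x^\sigma + \cdots \tag{11}$$

where the Taylor coefficient $B_{\tau\mu,\lambda\sigma}$ is, by construction, symmetric under the interchange of $\tau$ and $\mu$ or of $\lambda$ and $\sigma$. (Recall from chapter I.6 that the comma on the quantity $B$ is purely for typographical convenience: it helps us separate mentally two sets of indices $\tau\mu$ and $\lambda\sigma$ that appear for different reasons.)

Good! Plug this into $\Gamma^\lambda_{\rho\nu} = \frac12 g^{\lambda\tau}(\partial_\rho g_{\tau\nu} + \partial_\nu g_{\rho\tau} - \partial_\tau g_{\rho\nu})$ to find

$$\Gamma^\lambda_{\rho\nu} = \eta^{\lambda\tau}\left(B_{\tau\nu,\kappa\rho} + B_{\tau\rho,\kappa\nu} - B_{\rho\nu,\kappa\tau}\right)x^\kappa + O(x^2) \tag{12}$$

As expected, $\Gamma^\lambda_{\rho\nu}$ vanishes at the point $P$, and thus the expression (9) for the Riemann curvature tensor simplifies enormously to $R^\lambda{}_{\rho\mu\nu} = \partial_\mu\Gamma^\lambda_{\nu\rho} - (\mu \leftrightarrow \nu)$. Furthermore, the $\partial_\mu$ acting on $\Gamma^\lambda_{\nu\rho}$ merely removes $x^\kappa$ from the right hand side of (12) and sets the index $\kappa$ to $\mu$ in what is left. We thus find easily

$$R^\lambda{}_{\rho\mu\nu} = \eta^{\lambda\tau}\left(B_{\tau\nu,\mu\rho} + B_{\tau\rho,\mu\nu} - B_{\rho\nu,\mu\tau}\right) - (\mu \leftrightarrow \nu) \tag{13}$$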
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Note that the middle term in the parentheses is symmetric in wv and hence goes away. 
Lowering the A index, we obtain 


Ryo = Bry up - Bov, ut — (iu > v) = Bry up _ Bov, ur = Bry vp + Bou,ve 
= (Bry, pu + Boy,tv) a (Bov cy a Bru, pv) (14) 
To our surprise, we have 
Ryo = —Rorw (15) 


as we could already have seen from the second term in (14). The antisymmetry of the
Riemann curvature tensor $R_{\tau\rho\mu\nu}$ in the last pair of indices is, as remarked already, obvious
from its construction, but this antisymmetry in the first pair of indices $\tau\rho$ catches us by
surprise. It was completely obscured in (8) and in (9). (In chapter IX.7, we will give a better
understanding of this unexpected symmetry.)

Staring at (14) and remembering the symmetry properties of $B_{\cdot\cdot,\cdot\cdot}$, we next discover that
the Riemann tensor is symmetric upon interchange of the first and second pair of indices:


$R_{\tau\rho\mu\nu} = R_{\mu\nu\tau\rho}$ (16)


Note that (10) and (16) imply (15). In the same way, looking at (14), you can prove cyclicity
in the last three indices: $R_{\tau\rho\mu\nu} + R_{\tau\mu\nu\rho} + R_{\tau\nu\rho\mu} = 0$.
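The symmetries read off from (14) can also be checked by brute force. The sketch below is not
from the text, and its function names are purely illustrative. It fills the Taylor coefficients
with random integers having the two symmetries stated below (11), builds the lowered curvature
tensor at P from the last form of (14), and tests (10), (15), (16), and the cyclic identity
exactly in integer arithmetic.

```python
import itertools
import random


def random_B(d, rng):
    """Random integers B[t, m, l, s], symmetric in (t, m) and in (l, s)."""
    B = {}
    sym_pairs = list(itertools.combinations_with_replacement(range(d), 2))
    for t, m in sym_pairs:
        for l, s in sym_pairs:
            value = rng.randint(-9, 9)
            for a, b in ((t, m), (m, t)):
                for c, e in ((l, s), (s, l)):
                    B[a, b, c, e] = value
    return B


def riemann_flat_frame(B, d):
    """Last form of (14): R_{trmn} = (B_{tn,rm} + B_{rm,tn}) - (B_{rn,tm} + B_{tm,rn})."""
    R = {}
    for t, r, m, n in itertools.product(range(d), repeat=4):
        R[t, r, m, n] = (B[t, n, r, m] + B[r, m, t, n]) - (B[r, n, t, m] + B[t, m, r, n])
    return R


def check_symmetries(R, d):
    """Return a dict mapping each symmetry to True or False."""
    idx = list(itertools.product(range(d), repeat=4))
    return {
        "(10) antisymmetric in last pair": all(
            R[t, r, m, n] == -R[t, r, n, m] for t, r, m, n in idx),
        "(15) antisymmetric in first pair": all(
            R[t, r, m, n] == -R[r, t, m, n] for t, r, m, n in idx),
        "(16) symmetric under pair exchange": all(
            R[t, r, m, n] == R[m, n, t, r] for t, r, m, n in idx),
        "cyclic in last three indices": all(
            R[t, r, m, n] + R[t, m, n, r] + R[t, n, r, m] == 0 for t, r, m, n in idx),
    }


if __name__ == "__main__":
    d = 4
    R = riemann_flat_frame(random_B(d, random.Random(0)), d)
    for name, ok in check_symmetries(R, d).items():
        print(name, ok)
```

Integers are used instead of floats so that every equality is exact. The check only probes the
locally flat frame, which is all that is needed, since the relations are tensor equations.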

Professor Flat: “Let me stress it again. Since these symmetry relations are tensor
equations, they hold in any coordinate system, even though they are derived in a locally flat
coordinate system.”

As an exercise, you can show by explicit counting that, in 4-dimensional spacetime,
the 96 components in $R_{\tau\rho\mu\nu}$ indeed reduce to 20 independent components, thanks to the
symmetry relations just proved.
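The bookkeeping behind this count can be sketched in a few lines of Python. This is an
illustration rather than the author's own counting, and the function name is hypothetical. It
assumes, as stated in exercise 3, that cyclicity contributes d(d - 1)(d - 2)(d - 3)/24 new
constraints once the other symmetries have been imposed.

```python
from math import comb


def riemann_component_count(d):
    """Number of components left after each symmetry is imposed in d dimensions."""
    pairs = d * (d - 1) // 2                    # values of one antisymmetric index pair
    last_pair_only = d * d * pairs              # (10) alone: 96 for d = 4
    both_pairs = pairs * pairs                  # adding (15)
    pair_exchange = pairs * (pairs + 1) // 2    # adding (16): a symmetric matrix
    cyclic_constraints = comb(d, 4)             # d(d-1)(d-2)(d-3)/24, from exercise 3
    independent = pair_exchange - cyclic_constraints
    return last_pair_only, both_pairs, pair_exchange, independent


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, riemann_component_count(d))
```

For d = 4 the four stages are 96, 36, 21, and 20, in agreement with the count in chapter I.6.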


Onward to the Einstein-Hilbert action 


The last month I have lived through the most exciting and
the most exacting period of my life. . . . I saw clearly that a
satisfactory solution could only be reached by linking [the theory
of gravity] with Riemann variations.


—A. Einstein, writing to Arnold Sommerfeld, late 1915 


After all this math, let us not lose sight of what we are after: we want to construct an action 
for gravity, an action to describe how spacetime curves under the grip of gravity. The action 
is required to be invariant under general coordinate transformations. It certainly must not 
depend on the observer! Remarkably, this requirement determines the action uniquely. 
As explained earlier in chapter I.5, the coordinate invariant volume element is not $d^4x$,
but $d^4x\sqrt{-g}$ where $g = \det(g_{\mu\nu})$ denotes the determinant of the metric tensor regarded
as a matrix. (For spacetime, g is negative, hence the minus sign.) Thus, we demand
an action of the form $S_{\text{gravity}} = \int d^4x\sqrt{-g(x)}\,A(x)$, where A is a scalar (in other words,
$A'(x') = A(x)$ under a general coordinate transformation). For the action to govern the
dynamics of spacetime curvature, the unknown scalar A should contain two derivatives
acting on the metric, as already mentioned in chapter IV.2 and as explained earlier in this
chapter.


You say, now that we have the Riemann curvature tensor $R_{\tau\rho\mu\nu}$, we could simply contract
all the indices to obtain a scalar, namely a tensor with no indices. This scalar would then
allow us to construct the action.

More explicitly, the first step in the process would be to multiply the curvature tensor by
the metric tensor and sum over indices, obtaining a tensor with two indices. For example,
we could construct $g^{\tau\mu}R_{\tau\rho\mu\nu}$, which is evidently a tensor with two lower indices $\rho$ and $\nu$.
We say that we have contracted the indices $\tau$ and $\mu$. We then repeat the process, multiplying
by $g^{\rho\nu}$ to obtain $g^{\rho\nu}g^{\tau\mu}R_{\tau\rho\mu\nu}$.

Now we see why we have to study the symmetry properties of the curvature tensor. We 
need to know how many distinct scalars we can construct. 

We want to get to Einstein gravity as quickly as possible, but not any quicker! Pausing 
to study the symmetry properties of the curvature tensor was unavoidable. 

Indeed, some of the possible contractions turn out to give zero. For example, if we
start by contracting $\mu\nu$, the result $g^{\mu\nu}R_{\tau\rho\mu\nu}$ would evidently vanish, since $R_{\tau\rho\mu\nu}$ is
antisymmetric in $\mu\nu$ by construction, while $g^{\mu\nu}$ is symmetric. (To see this, note that
$g^{\mu\nu}R_{\tau\rho\mu\nu} = g^{\nu\mu}R_{\tau\rho\nu\mu} = -g^{\mu\nu}R_{\tau\rho\mu\nu}$, but an object equal to its negative can only
be zero. Remember exercise I.4.5?)

But from what we just learned in (15), we can’t contract $\tau\rho$ either. We can contract $\tau\mu$
and define a 2-indexed tensor known as the Ricci* tensor by


$R_{\rho\nu}(x) = g^{\tau\mu}(x)R_{\tau\rho\mu\nu}(x)$ (17)


But this is the only possibility! All the other contractions you can think of either vanish
(as mentioned above) or give the Ricci tensor again up to an overall sign. For instance,
$g^{\tau\nu}R_{\tau\rho\mu\nu} = -g^{\tau\nu}R_{\tau\rho\nu\mu} = -R_{\rho\mu}$. You can get only one single 2-indexed tensor by
contracting two of the four indices of the Riemann tensor!

Notice that (16) implies that the Ricci tensor is symmetric.
Now there is only one way, duh, to contract the two indices on the Ricci tensor:


$R(x) = g^{\rho\nu}(x)R_{\rho\nu}(x)$ (18)


This second contraction produces a scalar very imaginatively named the scalar curvature 
and also denoted by the letter R. I trust you not to be confused by this standard notation: 
there are three different tensors, all carrying the name R in honor of Riemann (or perhaps 
also Ricci!). The Riemann, Ricci, and scalar curvatures are distinguished by how many 
indices they carry: 4, 2, or 0, respectively. 
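
As a bookkeeping aid (a worked restatement, not an additional result), the six ways of
contracting two of the four indices of $R_{\tau\rho\mu\nu}$ with the inverse metric can be listed, using
only the definition (17) and the antisymmetries (10) and (15):

```latex
\begin{align*}
g^{\tau\rho}R_{\tau\rho\mu\nu} &= 0             && \text{by (15)} \\
g^{\mu\nu}R_{\tau\rho\mu\nu}   &= 0             && \text{by (10)} \\
g^{\tau\mu}R_{\tau\rho\mu\nu}  &= R_{\rho\nu}   && \text{definition (17)} \\
g^{\tau\nu}R_{\tau\rho\mu\nu}  &= -R_{\rho\mu}  && \text{by (10)} \\
g^{\rho\mu}R_{\tau\rho\mu\nu}  &= -R_{\tau\nu}  && \text{by (15)} \\
g^{\rho\nu}R_{\tau\rho\mu\nu}  &= R_{\tau\mu}   && \text{by (10) and (15)}
\end{align*}
```

Every nonvanishing entry is the Ricci tensor up to a sign, which is the uniqueness claimed above.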

In summary, out of the Riemann curvature tensor, we can form one and only one scalar,
namely R(x). Remarkable! Under a general coordinate transformation, the 20 components
(for 4-dimensional spacetime) of the curvature tensor transform into linear combinations
of each other with coefficients that depend on the transformation, but out of these, only
one combination remains unchanged. The scalar curvature is the unique scalar we can
form out of the metric and two powers of derivatives.


* Gregorio Ricci-Curbastro shortened his last name when he published the most important paper of his career.
Perhaps there is a lesson in there somewhere for the reader.


Thus, general coordinate invariance fixes the action for gravity uniquely to be
$\mathcal{I} = \int d^4x\sqrt{-g(x)}\,R(x)$, the spacetime integral of R(x), times some overall constant.

Let us do some simple dimensional analysis. The action has dimensions of mass
times length, which we will write as ML. (To see this, simply recall the action for a
point particle $S_{\text{point particle}} = -m\int d\tau$. We are of course using units in which c = 1.) In
contrast, the metric $g_{\mu\nu}$ is dimensionless and R, constructed out of the metric and two
derivatives, has dimension $1/L^2$. Hence the integral $\mathcal{I}$ has dimension $L^4 \cdot (1/L^2) = L^2$. To obtain
the action, we have to divide the integral by a constant with the dimensions of L/M to get the
dimension right. But recall from the introduction that Newton’s constant G has precisely
dimension L/M.

A highly satisfying fact! Einstein did not have to introduce any⁹ new fundamental
constants into physics to construct his theory of gravity. The action can only be
$S = aG^{-1}\int d^4x\sqrt{-g}\,R$, with a some pure number fixed by the requirement that S reduces
to Newtonian gravity in the appropriate limit. As we will see, $a = \frac{1}{16\pi}$, but we will not need
this historical number for quite a while.

The action for gravity, known as the Einstein-Hilbert action after its two discoverers, is
thus (trumpets please)


$S_{\rm EH} = \frac{1}{16\pi G}\int d^4x\sqrt{-g}\,R$ (19)


The Einstein-Hilbert action possesses a wonderfully unique quality. As Ludwig van Beethoven
declared about his composition, “Muß es sein? Es muß sein.” [Must it be? It must be.] Art
in its perfection must be a necessity. (See figure 2.)

We conclude that the action¹¹ of the universe is given by


$S = S_{\rm EH} + S_{\rm matter}$ (20)


where $S_{\rm matter}$ is defined as the action for everything else (for example, including the
electromagnetic field).


Figure 2 “Must it be? It must be.” (Illustration adapted from Fearful.)


Einstein’s field equation 


After we finish celebrating our success in deriving the action for gravity, we realize that,
as Euler and Lagrange remind us, we still have to vary the action S with respect to the
gravitational field $g_{\mu\nu}$ to obtain Einstein’s field equation. This variation turns out to involve
a bit of work. For instance, to determine how R varies with respect to $g_{\mu\nu}$, we have to
determine how the Riemann curvature tensor varies, and to get that, we have to determine
how the Christoffel symbol varies. Not that difficult to do, but it involves some setup. We
will postpone this until chapter VI.5.

Meanwhile, I will show you that we can get remarkably far doing practically no work.
First, without sweating through the actual varying, we will give a name to the result. Define
a tensor with two upper indices $K^{\mu\nu}(x)$ by


$\delta\mathcal{I} \equiv \delta\int d^4x\sqrt{-g}\,R(x) = \int d^4x\sqrt{-g(x)}\,K^{\mu\nu}(x)\,\delta g_{\mu\nu}(x)$ (21)


In varying S in (20), we have to vary $S_{\rm matter}$ also, of course. Here, in the spirit of getting to
Einstein’s field equation as quickly as possible, I will take a shortcut. We content ourselves,
for the moment, with Einstein’s field equation for empty spacetime; that is, we drop $S_{\rm matter}$
from the action, so that we avoid having to vary $S_{\rm matter}$. With that simplification, the field
equation is given simply by $K^{\mu\nu}(x) = 0$.

As we will see, this suffices to derive the Schwarzschild metric and hence the physics 
of the three classic tests and of black holes. 

So, let us try to get away with doing as little work as possible. Even though we don’t
know what $K^{\mu\nu}$ is, we know a lot about him. Since in (21), the two derivatives contained
in R(x) cannot disappear into thin air, $K^{\mu\nu}(x)$ must contain two derivatives, no more, no
less. Also, $K^{\mu\nu}$ is manifestly a tensor, since in (21) $\delta\mathcal{I}$ is a scalar and $\delta g_{\mu\nu}$ is a tensor.

So list all the tensors with two upper indices we know of. The metric tensor $g^{\mu\nu}$ comes
to mind (of course), but it does not contain derivatives. However, we could multiply $g^{\mu\nu}$
by the scalar curvature, which does contain two derivatives, just what we need. So $g^{\mu\nu}R$
is one candidate. Next up is the Ricci tensor $R^{\mu\nu}$, which also contains two derivatives.
That’s it. (You might suggest, for example, the tensor $R^{\mu\rho\sigma\nu}R_{\rho\sigma}$, but it contains four
derivatives.) In other words, $K^{\mu\nu}$ must be a linear combination of $g^{\mu\nu}R$ and $R^{\mu\nu}$:
$K^{\mu\nu} = A(R^{\mu\nu} + a g^{\mu\nu}R)$, with A and a two unknown numerical constants we will have to work
to determine.


Hence, with almost no work, we obtain Einstein’s field equation in empty spacetime:

$K^{\mu\nu} = A(R^{\mu\nu} + a g^{\mu\nu}R) = 0$ (22)


The constant A cannot vanish, since that would imply that the proposed action is
independent¹² of the metric. Next, multiplying (22) by $g_{\mu\nu}$, we obtain $(1 + 4a)R = 0$. Unless we
are extremely unlucky and a turns out to be $-\frac{1}{4}$ (we will verify soon enough that it is not),
we can conclude that R = 0. Hence (22) says that


$R^{\mu\nu} = 0$ (23)


Figure 3 Einstein writing his field equation.


This is the Einstein field equation in empty spacetime. (In figure 3, we see Einstein
writing¹³ this famous equation.) It says that the Ricci tensor vanishes. We speak of the
field equation in the singular, but in fact it consists of a set of equations according to the
values taken by $\mu\nu$. It is important to realize that the vanishing of the Ricci tensor does not
imply the vanishing of the Riemann curvature tensor. Evidently, $R_{\mu\nu}(x) = g^{\sigma\rho}(x)R_{\sigma\mu\rho\nu}(x)$,
given by a sum over the components of the Riemann curvature tensor, can vanish without
$R_{\sigma\mu\rho\nu}(x)$ having to vanish. Were the Riemann curvature tensor zero, spacetime would be
flat. Einstein tells us that, in empty spacetime, a particular sum of the various components
of the Riemann curvature tensor vanishes.

We are now in a position to derive the celebrated Schwarzschild metric around a mass
distribution, study black holes, and even play with the universe under some circumstances.
See the following chapters.

I will show by a simple calculation in appendix 1 that $a \neq -\frac{1}{4}$. I could have done it here,
but I did not want to slow us down on our way to the famous field equation for empty
spacetime.
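
For readers who want the step from (22) to (23) spelled out, here is the trace written as a
short chain. It uses only $g_{\mu\nu}R^{\mu\nu} = R$ and $g_{\mu\nu}g^{\mu\nu} = 4$ in 4-dimensional spacetime,
together with the two assumptions made in the text, namely $A \neq 0$ and $a \neq -\frac{1}{4}$:

```latex
\begin{align*}
0 = g_{\mu\nu}K^{\mu\nu}
  &= A\left(g_{\mu\nu}R^{\mu\nu} + a\,g_{\mu\nu}g^{\mu\nu}R\right)
   = A(1 + 4a)R
  \;\Longrightarrow\; R = 0, \\
0 = K^{\mu\nu}
  &= A\left(R^{\mu\nu} + a\,g^{\mu\nu}\cdot 0\right)
  \;\Longrightarrow\; R^{\mu\nu} = 0 .
\end{align*}
```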


What we have done 


Since this has been a lightning fast way, the fastest I know of, of deriving the
Einstein field equation (albeit in empty spacetime), it is worthwhile summarizing what we
have done.*

Ever since chapter IV.3, we have suspected that the action for the gravitational field has 
to contain two powers of derivatives acting on the dynamical variable, namely the metric. 
And even earlier, ever since chapter I.6, we have anticipated (more than once in fact) that 
the curvature is given by two powers of derivatives acting on the metric. 

To determine the curvature tensor, we acted with the commutator $[D_\mu, D_\nu]$ on some
arbitrary vector field $S_\rho(x)$. This involved two powers of derivatives and the metric tensor
all over the place, and thus had the structure we were looking for. The beginner might
have felt a bit overwhelmed by the plethora of indices, but in fact it only took two lines to
get from (6) to (8). Once we obtained the curvature tensor $R^\sigma{}_{\rho\mu\nu}$, we followed Professor
Flat, as always, to locally flat coordinates and worked out the symmetry properties of the
tensor, which showed us that there was one, and only one, scalar we can form to put into
the action.

Thus was the Einstein-Hilbert action $S_{\rm EH}$ uniquely determined.

If we dispense with matter for the time being, we don’t even have to do any work varying
to obtain the equation of motion $K^{\mu\nu} = 0$. If a certain constant (a in (22)) does not have
a particular value, we can wing it and argue by symmetry considerations that the field
equation amounts to simply $R_{\mu\nu} = 0$. In the next chapter, we will solve this equation.


* After we had worked out the symmetry properties of the Riemann tensor, we could, of course, have argued
directly from (22) without mentioning the action. See appendix 6 in chapter VI.5.


Riemann curvature tensor


During our headlong sprint to the field equation, we barely noticed that we derived the
long-sought expression for the Riemann curvature tensor. From way back early in this
book, already in part I, we have talked about curvature and have sought to calculate it.
We discussed some intuitive methods. You might recall that one method, if the space is
a surface, involves marking out a circle and measuring its radius and circumference. In
another, we have to determine the tangent plane. More generally, we can go to locally
flat coordinates and calculate the quadratic deviation of the metric. All these methods,
while conceptually clear, are rather unwieldy to implement in practice, adequate mostly
for curved surfaces only. With this perspective, we can appreciate the power of the Riemann
curvature tensor. Given a metric, determine the Christoffel symbols (most efficiently by
using the action principle to obtain the geodesic equations). Then simply plug in (9) to
obtain the curvature.

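For fun, we can try it on the sphere with $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$. Note from chapter
II.2 that the only nonvanishing components, $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$, are
independent of $\varphi$, and so
$R^\varphi{}_{\theta\varphi\theta} = (\partial_\varphi\Gamma^\varphi_{\theta\theta} + \Gamma^\varphi_{\varphi\kappa}\Gamma^\kappa_{\theta\theta}) - (\partial_\theta\Gamma^\varphi_{\varphi\theta} + \Gamma^\varphi_{\theta\kappa}\Gamma^\kappa_{\varphi\theta})
= -(\partial_\theta\Gamma^\varphi_{\varphi\theta} + \Gamma^\varphi_{\theta\varphi}\Gamma^\varphi_{\varphi\theta}) = -(\partial_\theta\cot\theta + \cot^2\theta) = +1$.
We know from chapter I.6 that in 2 dimensions,
the Riemann curvature tensor has only one component. So all the other components
must either vanish or be related to the one we just calculated: for example,
$R^\theta{}_{\varphi\theta\varphi} = g^{\theta\theta}R_{\theta\varphi\theta\varphi} = R_{\theta\varphi\theta\varphi} = R_{\varphi\theta\varphi\theta} = g_{\varphi\varphi}R^\varphi{}_{\theta\varphi\theta} = \sin^2\theta$.
Thus, $R_{\theta\theta} = 1$ and $R_{\varphi\varphi} = \sin^2\theta$, giving
$R = g^{\theta\theta}R_{\theta\theta} + g^{\varphi\varphi}R_{\varphi\varphi} = 2$, a constant, as might be expected.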

Best of all, the procedure is straightforwardly algorithmic, and you can easily instruct a 
computer to produce the curvature tensor once you input the metric. Even with my rather 
rudimentary computer skills, I was able to do it. 
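
To make the algorithmic nature of the procedure concrete, here is a minimal numerical sketch
in Python. It is not the author's program: it replaces symbolic algebra with central finite
differences, and all function names are illustrative. The metric is supplied as a function of
the coordinates, the Christoffel symbols come from the standard formula in terms of the metric
and its inverse, and the curvature follows from (9) with the Ricci tensor and scalar curvature
defined as in (17) and (18). The optional `radius` argument is an addition for illustration: a
sphere of radius r has metric $r^2$ times the unit-sphere metric, so it also gives a feel for
the scaling behavior discussed in appendix 1.

```python
import math


def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix given as a list of lists."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def shifted(x, k, h):
    """Copy of the point x with coordinate k displaced by h."""
    y = list(x)
    y[k] += h
    return y


def christoffel(metric, x, h=1e-4):
    """G[l][r][v] approximates Gamma^l_{rv} at x using central differences of the metric."""
    n = len(x)
    ginv = invert(metric(x))
    dg = []  # dg[k][a][b] approximates the derivative of g_{ab} with respect to x^k
    for k in range(n):
        gp, gm = metric(shifted(x, k, h)), metric(shifted(x, k, -h))
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)])
    return [[[0.5 * sum(ginv[l][t] * (dg[r][t][v] + dg[v][r][t] - dg[t][r][v])
                        for t in range(n))
              for v in range(n)] for r in range(n)] for l in range(n)]


def curvature(metric, x, h=1e-4):
    """Return (riemann, ricci, scalar) at x, with riemann[s][r][m][v] = R^s_{rmv} as in (9)."""
    n = len(x)
    idx = range(n)
    G = christoffel(metric, x, h)
    dG = []  # dG[m][s][a][b] approximates the derivative of Gamma^s_{ab} with respect to x^m
    for m in idx:
        Gp = christoffel(metric, shifted(x, m, h), h)
        Gm = christoffel(metric, shifted(x, m, -h), h)
        dG.append([[[(Gp[s][a][b] - Gm[s][a][b]) / (2 * h) for b in idx]
                    for a in idx] for s in idx])
    riemann = [[[[dG[m][s][v][r] - dG[v][s][m][r]
                  + sum(G[s][m][k] * G[k][v][r] - G[s][v][k] * G[k][m][r] for k in idx)
                  for v in idx] for m in idx] for r in idx] for s in idx]
    ricci = [[sum(riemann[m][r][m][v] for m in idx) for v in idx] for r in idx]
    ginv = invert(metric(x))
    scalar = sum(ginv[r][v] * ricci[r][v] for r in idx for v in idx)
    return riemann, ricci, scalar


def sphere_metric(radius=1.0):
    """Metric of a sphere in coordinates x = [theta, phi]; radius is an illustrative extra."""
    def metric(x):
        theta = x[0]
        return [[radius ** 2, 0.0], [0.0, (radius * math.sin(theta)) ** 2]]
    return metric


if __name__ == "__main__":
    _, ricci, scalar = curvature(sphere_metric(), [1.0, 0.3])
    print("R_theta_theta ~", ricci[0][0], " R ~", scalar)
```

On the unit sphere the printed values should be close to 1 and 2, up to finite-difference
error. A symbolic algebra system would return exact expressions; the numerical version is
chosen here only because it needs nothing beyond the standard library.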


Appendix 1: A scaling argument on the way to Einstein’s field equation 


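In the text, instead of sweating through actually varying the action with respect to $g_{\mu\nu}$, we winged it, arguing from
symmetry and other general considerations that the variation must have the form (21):
$\delta\mathcal{I} = \delta\int d^4x\sqrt{-g}\,R = \int d^4x\sqrt{-g}\,A(R^{\mu\nu} + a g^{\mu\nu}R)\,\delta g_{\mu\nu}$.
In this expression, we must take for $\delta g_{\mu\nu}$ an arbitrary variation, as Euler and
Lagrange had instructed us.

We now give a simple proof that a cannot be $-\frac{1}{4}$. The trick is to consider a specific, rather than an arbitrary,
variation, and to pick an especially simple variation, namely $g_{\mu\nu}(x) \to \tilde{g}_{\mu\nu}(x) = \Omega^2 g_{\mu\nu}(x)$ with $\Omega$ a number. In
other words, we scale* the metric. For an infinitesimal transformation, we write $\Omega^2 \simeq 1 + \epsilon$. Then
$\delta g_{\mu\nu}(x) = \tilde{g}_{\mu\nu}(x) - g_{\mu\nu}(x) = \epsilon g_{\mu\nu}(x)$.

Under $g_{\mu\nu} \to \Omega^2 g_{\mu\nu}$, we have $\Gamma \sim g^{\cdot\cdot}\partial_\cdot g_{\cdot\cdot} \to \Omega^{-2}\Omega^2 g^{\cdot\cdot}\partial_\cdot g_{\cdot\cdot} \sim \Gamma$, that is, $\Gamma \to \Gamma$ and so
$R^\cdot{}_{\cdot\cdot\cdot} \sim \partial\Gamma + \Gamma\Gamma \to R^\cdot{}_{\cdot\cdot\cdot}$, thus leading to $R_{\mu\nu} \to R_{\mu\nu}$ and $R = g^{\mu\nu}R_{\mu\nu} \to \Omega^{-2}R$.
Since $g \to \Omega^8 g$ (being the determinant of a 4-by-4
matrix), we obtain $\mathcal{I} \to \Omega^4\Omega^{-2}\mathcal{I} = \Omega^2\mathcal{I}$ and hence $\delta\mathcal{I} = \Omega^2\mathcal{I} - \mathcal{I} \simeq \epsilon\mathcal{I}$, where in the last step we have gone to
the infinitesimal limit $\Omega^2 \simeq 1 + \epsilon$. However, plugging $\delta g_{\mu\nu} = \epsilon g_{\mu\nu}$ into (21), we have


$\delta\mathcal{I} = \epsilon\int d^4x\sqrt{-g}\,A(R^{\mu\nu} + a g^{\mu\nu}R)g_{\mu\nu} = \epsilon A(1 + 4a)\int d^4x\sqrt{-g}\,R = \epsilon A(1 + 4a)\mathcal{I}$


Equating our two results for $\delta\mathcal{I}$, we find $A(1 + 4a) = 1$. Ta dah, $a \neq -\frac{1}{4}$. In chapter VI.5, we will show that
$a = -\frac{1}{2}$.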

Conceptually, it is important to realize that the transformation $g_{\mu\nu}(x) \to \tilde{g}_{\mu\nu}(x) = \Omega^2 g_{\mu\nu}(x)$ is not a general
coordinate transformation. Of course not, since the general coordinate invariant object $\mathcal{I}$ actually varies under
this transformation. But students are often confused, because it looks like a general coordinate transformation. Be
careful! If we plug the coordinate transformation $x'^\mu = x^\mu/\Omega$ into the general formula
$g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}$,
we obtain $g'_{\mu\nu}(x/\Omega) = \Omega^2 g_{\mu\nu}(x)$. The crucial difference is that the argument on the left hand side is not x,
but $x/\Omega$.


* See chapter IX.9 for a more extensive discussion of this and related transformations.


Appendix 2: A mnemonic


The expression for the Riemann curvature tensor in (9) is perhaps the most involved you have encountered thus 
far in your study of physics, and so I am moved to say a few words about how it is not that hard to remember. 
Of course, in general, there is not much point to memorizing physics formulas: it is much better to reconstruct 
them when needed, or failing that, to look them up. 
For convenience, let me repeat (9) here:

$R^\sigma{}_{\rho\mu\nu} = (\partial_\mu\Gamma^\sigma_{\nu\rho} + \Gamma^\sigma_{\mu\kappa}\Gamma^\kappa_{\nu\rho}) - (\partial_\nu\Gamma^\sigma_{\mu\rho} + \Gamma^\sigma_{\nu\kappa}\Gamma^\kappa_{\mu\rho})$ (24)

First, the general structure is clear from our schematic derivation $R^\cdot{}_{\cdot\cdot\cdot} \sim [\partial + \Gamma, \partial + \Gamma] \sim \partial\Gamma + \Gamma\Gamma$. No
doubt, everybody could come up with a different mnemonic for where the indices go. We need only construct the
first half of the expression in (24), since we can obtain the second half by interchanging $\mu \leftrightarrow \nu$ in the first half.
We have three lower indices but only one upper index on $R^\sigma{}_{\rho\mu\nu}$, so $\sigma$ is “king.” In the first term $\partial_\cdot\Gamma^\sigma_{\cdot\cdot}$, the only
question is which of the three lower indices on $R^\sigma{}_{\rho\mu\nu}$ goes on the partial derivative? It can’t be $\rho$, because $\Gamma^\sigma_{\mu\nu}$
is symmetric in $\mu\nu$, but $R^\sigma{}_{\rho\mu\nu}$ is antisymmetric by construction. It could be $\mu$ or $\nu$, but $\mu$ “comes before” $\nu$ in
$R^\sigma{}_{\rho\mu\nu}$ and so we pick* him, marking $\mu$ as “special.” So of the 4 indices, we have separated two guys, “king” and
“special.” The first term in (24) is thus uniquely fixed as $\partial_\mu\Gamma^\sigma_{\nu\rho}$.

In the second term $\Gamma^\cdot_{\cdot\cdot}\Gamma^\cdot_{\cdot\cdot}$, we have two upper
slots and four lower slots for us to lodge one upper index $\sigma$ and three lower indices $\rho\mu\nu$ into. Once we put $\sigma$ in,
since we don’t have another upper index, the remaining upper slot in $\Gamma^\sigma_{\cdot\cdot}\Gamma^\cdot_{\cdot\cdot}$ has to be occupied by a dummy
index $\kappa$ to be summed with its lower counterpart: but this lower $\kappa$ cannot be on the same $\Gamma$ as the upper $\kappa$, since
we know from chapter V.6 that $\Gamma^\kappa_{\kappa\cdot}$ simplifies, and we don’t remember that happening in the derivation. So for
the second term, we have $\Gamma^\sigma_{\kappa\cdot}\Gamma^\kappa_{\cdot\cdot}$ thus far. We also know that $\Gamma^\kappa_{\cdot\cdot}$ cannot be $\Gamma^\kappa_{\mu\nu}$ by symmetry; thus, the indices
$\mu$ and $\nu$ can’t be on the same $\Gamma$. Since $\mu$ is “special,” he clamors to be with the “king” on the same $\Gamma$. So we
arrived at $\Gamma^\sigma_{\mu\kappa}\Gamma^\kappa_{\nu\rho}$. Finally, remember to antisymmetrize in the last two indices.

Anyway, that is how I do it, but the reader might come up with a better way. These days, of course, as I 
said, what I do is simply write an algebraic manipulation program once and for all. If you are marooned on a 
deserted island, probably your best tack would be to go through the simple derivation commuting two covariant 
derivatives. 

In chapters IX.7 and IX.8, I will describe a more powerful method for calculating curvature using differential 
forms. 


Exercises 


1 Given two vector fields $W_\rho$ and $U^\rho$ on the sphere (with $\rho = \theta, \varphi$ of course), calculate $D_\lambda W_\rho$ and $D_\lambda U^\rho$
explicitly. As a small check, show that $(D_\lambda W_\rho)U^\rho + W_\rho(D_\lambda U^\rho)$ is equal to $\partial_\lambda(W_\rho U^\rho)$.


2 Evaluate (5) explicitly on the sphere and thus obtain the Riemann curvature tensor for the sphere.


3 Derive from (14) another important property, namely that the curvature tensor has the cyclic symmetry

$R_{\tau\rho\mu\nu} + R_{\tau\mu\nu\rho} + R_{\tau\nu\rho\mu} = 0$

We hold the index $\tau$ fixed and cyclically permute the triplet $\rho\mu\nu$. Again, this is hardly evident from (9). Show
that this imposes $d(d - 1)(d - 2)(d - 3)/24$ constraints.


* Some authors pick $\nu$ as special because he is last. A word of caution: the curvature tensors used in the
literature can thus differ by an overall sign.


4 Combine the definition of the Riemann curvature tensor in (5), $[D_\mu, D_\nu]V_\rho = -R^\sigma{}_{\rho\mu\nu}V_\sigma$, with exercise 3
to show that, for any vector V,

$[D_\mu, D_\nu]V_\rho + [D_\nu, D_\rho]V_\mu + [D_\rho, D_\mu]V_\nu = 0$ (25)

We will need this when we discuss isometries in chapter IX.6.


5 Show that (5), which says that $[D_\mu, D_\nu]V_\rho = -R^\sigma{}_{\rho\mu\nu}V_\sigma$ for any vector, can be generalized to any tensor,
for example, $[D_\mu, D_\nu]T_{\lambda\rho} = -R^\sigma{}_{\lambda\mu\nu}T_{\sigma\rho} - R^\sigma{}_{\rho\mu\nu}T_{\lambda\sigma}$.


6 Given its various symmetry properties, the Riemann curvature tensor in 2 dimensions has only one
independent component $R_{1212}$, with all other nonvanishing components related to it. Various components, such
as $R_{1112}$, all vanish. Show that all these facts can be summarized compactly by the expression
$R_{\tau\rho\mu\nu} = R_{1212}(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})/g$, with $g = \det g = g_{11}g_{22} - g_{12}g_{21}$. Contract to find the Ricci tensor and the scalar
curvature. Show that $R_{\tau\rho\mu\nu} = \frac{1}{2}R(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})$.


7 In cases where the metric has the form (11), (14) gives an alternative to calculate the Riemann curvature
tensor. Do this for the example in chapter I.6 with the metric
$ds^2 = dx^2 + dy^2 + dz^2 = dx^2 + dy^2 + ((ax + cy)dx + (by + cx)dy)^2$. Show that it is given by

$R_{1212} = 2B_{12,12} - B_{11,22} - B_{22,11}$

Hint: The term $B_{11,22}$ is the coefficient of $y^2$ in $g_{11}$, which we
can read off as $c^2$.


8 Back in chapter I.5, I asked which of the two spaces described by $ds^2 = (1 + u^2)du^2 + (1 + 4v^2)dv^2 + 2(2v - u)du\,dv$
and $ds^2 = (1 + u^2)du^2 + (1 + 2v^2)dv^2 + 2(2v - u)du\,dv$ is curved. Now you can answer this question
readily.


9 Consider a (4 + 1)-dimensional spacetime with $ds^2 = \eta_{\mu\nu}dx^\mu dx^\nu + \varphi(x)^2dy^2$. Note that the 4th spatial
coordinate is called y and that the function $\varphi(x)$ does not depend on y. Show that the scalar curvature
is given by $R = -\frac{2}{\varphi}\Box\varphi$, where $\Box\varphi = \frac{1}{\sqrt{-g}}\partial_\mu(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\varphi)$ simplifies in the present instance to $\eta^{\mu\nu}\partial_\mu\partial_\nu\varphi$.
We will need the result of this exercise in chapter IX.1.


10 Petrov notation: group the four indices carried by $R_{\tau\rho\mu\nu}$ into two sets of two, in other words, write $R_{AB}$ with
the indices $A = (\tau\rho)$ and $B = (\mu\nu)$ and regard $R_{AB}$ as a matrix. The various symmetry properties can then
be interpreted as properties of this matrix.

(a) Show that the indices A and B each take on $\frac{1}{2}d(d - 1)$ values.

(b) Show that $R_{AB}$ is a symmetric matrix, and count the number of independent components it contains
due to this fact.

(c) Count the number of independent components after imposing the constraints mandated by exercise 3.
Show that the resulting number of independent components contained in the Riemann curvature tensor
agrees with that given in chapter I.6. Hint: Check your computation after each step for some small values
of d.


11 Is there a dimension d in which the Riemann and the Ricci tensors have the same number of independent
components? Hint: The answer is contained in the next exercise.


12 The result of exercise 11 suggests that, for d = 3, there exists a relation between the Riemann and the Ricci
tensors. Using various properties of these tensors, you can practically write down this relation:

$R_{\tau\rho\mu\nu} = g_{\tau\mu}R_{\rho\nu} - g_{\tau\nu}R_{\rho\mu} - g_{\rho\mu}R_{\tau\nu} + g_{\rho\nu}R_{\tau\mu} - \frac{1}{2}(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})R$

Prove this. Hint: Ask Professor Flat for help.


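13 Given a metric $g_{\mu\nu}$, let’s construct another metric $\tilde{g}_{\mu\nu}(x) = \Omega^2(x)g_{\mu\nu}(x)$. The two metrics are said to be
conformally related. (Recall that way back in chapter I.5, you worked with conformally flat metrics in an
exercise. In the present terminology, a metric is said to be conformally flat if it is conformally related to the
flat metric.) Show that various quantities calculated for $\tilde{g}_{\mu\nu}$ can be expressed in terms of the corresponding
quantities calculated for $g_{\mu\nu}$ as follows:

$\tilde{\Gamma}^\lambda_{\mu\nu} = \Gamma^\lambda_{\mu\nu} + \Omega^{-1}(\delta^\lambda_\mu\partial_\nu\Omega + \delta^\lambda_\nu\partial_\mu\Omega - g_{\mu\nu}g^{\lambda\rho}\partial_\rho\Omega)$ (26)

$\tilde{R}^\lambda{}_{\sigma\mu\nu} = R^\lambda{}_{\sigma\mu\nu} - (\delta^\lambda_\mu\delta^\alpha_\nu\delta^\beta_\sigma - \delta^\lambda_\nu\delta^\alpha_\mu\delta^\beta_\sigma + g_{\sigma\nu}\delta^\alpha_\mu g^{\lambda\beta} - g_{\sigma\mu}\delta^\alpha_\nu g^{\lambda\beta})\Omega^{-1}D_\alpha\partial_\beta\Omega$
$+ (2\delta^\lambda_\mu\delta^\alpha_\nu\delta^\beta_\sigma - 2g_{\sigma\mu}\delta^\alpha_\nu g^{\lambda\beta} + 2g_{\sigma\nu}\delta^\alpha_\mu g^{\lambda\beta} - 2\delta^\lambda_\nu\delta^\alpha_\mu\delta^\beta_\sigma + g_{\sigma\mu}g^{\alpha\beta}\delta^\lambda_\nu - g_{\sigma\nu}g^{\alpha\beta}\delta^\lambda_\mu)\Omega^{-2}(\partial_\alpha\Omega)(\partial_\beta\Omega)$

$\tilde{R}_{\mu\nu} = R_{\mu\nu} - [(d - 2)\delta^\rho_\mu\delta^\sigma_\nu + g_{\mu\nu}g^{\rho\sigma}]\Omega^{-1}D_\rho\partial_\sigma\Omega + [2(d - 2)\delta^\rho_\mu\delta^\sigma_\nu - (d - 3)g_{\mu\nu}g^{\rho\sigma}]\Omega^{-2}(\partial_\rho\Omega)(\partial_\sigma\Omega)$

$\tilde{R} = \Omega^{-2}R - 2(d - 1)g^{\rho\sigma}\Omega^{-3}D_\rho\partial_\sigma\Omega - (d - 1)(d - 4)g^{\rho\sigma}\Omega^{-4}(\partial_\rho\Omega)(\partial_\sigma\Omega)$ (27)

Here we can of course always write $D_\rho\partial_\sigma\Omega$ more symmetrically as $D_\rho D_\sigma\Omega$. Some of these expressions
suggest that we write $\Omega = e^\omega$, as some authors prefer.


14 For a conformally flat metric $\tilde{g}_{\mu\nu}$, the results of the preceding exercise are particularly useful, since $R^\lambda{}_{\sigma\mu\nu}$
and all the curvature invariants derived from $g_{\mu\nu}$ vanish. Recall that the sphere is conformally flat, with
$ds^2 = (1 + \frac{\rho^2}{4})^{-2}(d\rho^2 + \rho^2d\varphi^2)$. Verify that the curvature of the sphere is in fact constant using the results
of the preceding exercise.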


15 Weyl tensor: the Weyl tensor in d-dimensional spacetime (or space) is defined by

$C_{\mu\nu\rho\sigma} = R_{\mu\nu\rho\sigma} + (d - 2)^{-1}(g_{\mu\sigma}R_{\rho\nu} + g_{\nu\rho}R_{\sigma\mu} - g_{\mu\rho}R_{\sigma\nu} - g_{\nu\sigma}R_{\rho\mu})$
$+ ((d - 1)(d - 2))^{-1}(g_{\mu\rho}g_{\sigma\nu} - g_{\mu\sigma}g_{\rho\nu})R$

(a) Show that the Weyl tensor has all the same symmetries as the Riemann tensor but that in addition it is
traceless: if we contract any pair of indices carried by the Weyl tensor with the metric, we get nothing.

(b) Using the result of exercise 13, show that if two metrics $g_{\mu\nu}$ and $\tilde{g}_{\mu\nu}$ are conformally related,
$\tilde{C}^\mu{}_{\nu\rho\sigma} = C^\mu{}_{\nu\rho\sigma}$ (note the one raised index). It follows that the Weyl tensor vanishes if the metric is conformally
flat. Hence, the Weyl tensor is also known as the conformal tensor and can be used to test for conformal
flatness (just as the Riemann tensor is used to test for plain old flatness).


16 Recall from chapter V.6 the definition of $D_V$ associated with a vector field V. Show that for three vector fields
U, V, W, we have

$D_U D_V W^\lambda - D_V D_U W^\lambda = D_{[U,V]}W^\lambda + R^\lambda{}_{\sigma\mu\nu}U^\mu V^\nu W^\sigma$


17 Show that the space described by $ds^2 = y^2dx^2 + x^2dy^2$ is actually flat (a) by direct calculation of the Riemann
curvature and (b) by showing that the metric is conformally flat and then using the result of exercise 13.


Notes


1. A. Einstein, Essays in Science, p. 84.

2. Not to mention the Long March!

3. As explained in QFT Nut, and as we will briefly describe in an appendix to chapter IX.7, the Dirac action
for spin $\frac{1}{2}$ fields provides an interesting “exception,” but not truly an exception, as we could formulate the
principle more cumbersomely by saying “two or fewer powers” rather than “two powers.”

4. For a pedagogical discussion, see R. Woodard, arXiv:0907.4238, p. 31.

5. Recall chapter I.3 on rotations.

6. In the previous paragraph, Einstein wrote, “I worked on these problems from 1912 to 1914 together with my
friend Grossmann.”

7. A. Einstein, Essays in Science, p. 83.

8. The appearance of the imaginary unit i indicates that in electromagnetism, the covariant derivative is to be
understood in the context of quantum mechanics. In contrast, in gravity, the covariant derivative is a purely
classical construct. You may or may not know that the Schrödinger equation for a (nonrelativistic) charged
particle in a magnetic field reads


fe 
ar 


yf ey ee (28) 
2m 


which (as you can see) is obtained from the Schrödinger equation in the absence of the magnetic field by
turning the ordinary derivative $\nabla$ into the covariant derivative $\nabla - iA$. The relativistic completion of this is
evidently $\partial_\mu - iA_\mu$. For those readers unfamiliar with (28), here is a quick derivation. Start with the classical
Lagrangian for a charged particle in a magnetic field L = 4 ea yA. aa. (This is simply the nonrelativistic 
version of the Lagrangian, which can be read off from the action studied in chapter IV.1, with the proper 
time t replaced by time r. We also denote the position of the particle by g to conform to standard usage in 
this context.) The conjugate momentum is then given by p = 24 =m ae As Eliminating 44 = (p+ A)/m 


dq dt 
8a 


in the Hamiltonian (recall chapter III.5) H(p, q) = p- a4 — LG, ay, we obtain H = a (p + A)? Finally, 
we go from classical mechanics to quantum mechanics by setting p > iV, thus obtaining (28). 


9. In contrast, in contemporary particle theory, every time we turn around to construct a new action (following
Einstein’s lead in fact) to explain something or another, we run into what seem like 29 new and hitherto
unmeasured constants. Imagine what the history of Einstein gravity would have been like if the action
contained 7 constants and experimentalists had to go out and measure 6 of them.

10. Fearful, pp. 93-94.

11. To me, the Einstein-Hilbert action is just about the simplest, and hence the most beautiful, action in all of
physics. Of course, as they say, simplicity is in the eyes of the beholder, and you the beholder have to know
what the letter R stands for.

12. In that case, $\mathcal{I}$ would be a topological invariant; we are implicitly assuming that it is not.

13. From the back cover of the Japanese translation of my popular book An Old Man’s Toy. As far as I know,
Einstein did not write backward as a matter of habit. The photo was just printed this way, showing that my
Japanese publisher did not know Einstein gravity. One of my distinguished colleagues quipped that Einstein
would naturally write in Hebrew.


VI.2 To Cosmology as Quickly as Possible


The universe made as simple as possible, but not any simpler 


[I had] again committed, in regards to gravity, something which
puts me in danger of being shut up in an insane asylum.


—Einstein, writing to Paul Ehrenfest on February 4, 1917 


Newton was bold enough to apply his mechanics to the celestial sphere where the planets
reside. But the audacity of Einstein, thinking that the universe could¹ be described by an
equation! Turned out that he was right and in no danger of being dragged away by men in
white coats.

In the preceding chapter, we derived, with scandalously little work, Einstein’s field
equation


$R_{\mu\nu}(x) = 0$ (1)


in empty spacetime. We first deduced on general grounds, invoking symmetry
considerations with gleeful abandon, what the action for Einstein gravity must be. Then, instead
of carefully varying the action as dutiful scholars would, we winged it like a lazy southern
Californian, and arrived at (1). We could now derive the warped spacetime around a mass,
a star or a black hole, but I will postpone that until the next chapter.

In the spirit of the preceding chapter, I would like to get you to cosmology as quickly as 
possible. 
Filling the universe with a constant energy density


We obtained (1) for empty spacetime. Here, let’s get the universe expanding by filling it
with the “simplest stuff”* we know of, namely a positive† constant energy density $\Lambda$.

It is instructive to go all the way back to the most elementary example of an action,
namely the action of a nonrelativistic particle in a potential

$$S_{\text{nonrelativistic particle}} = \int dt \left[ \frac{1}{2} m \left( \frac{dq}{dt} \right)^2 - V(q) \right]$$


and ask how we would include a constant energy density $\Lambda$ pervading all of space. Well,
the resulting total energy amounts to, duh, $\int d^3x\,\Lambda$, thus shifting the potential $V \to
V + \int d^3x\,\Lambda$ and hence adding to the action the term $-\int dt \int d^3x\,\Lambda$. Before Einstein,
nobody would care about an additive constant in the potential: the equation of motion
is not sensitive to it. Indeed, the particle does not even know that we added something:
the dynamical variable $q$ does not appear anywhere in the added term.

But in curved spacetime, we know that the 4-volume element $dt\,d^3x = d^4x$ has to be
modified to $d^4x\sqrt{-g}$.

So yes, gravity knows. No way to sneak around gravity and stealthily add a constant to 
the Lagrangian. 

Thus, we add to the Einstein-Hilbert action $S_{\rm EH}$ the outrageously simple term

$$S_{\text{cosmological}} = -\int d^4x \sqrt{-g}\,\Lambda$$

as was first done by Einstein, who referred to $\Lambda$ as the cosmological constant.‡

In the late 1990s, observational cosmologists discovered that the universe is suffused 
by a mysterious dark energy. The origin of this dark energy remains shrouded in mystery. 
But over the years, observational cosmologists have established, with ever diminishing 
uncertainty, that the density of dark energy is constant in spacetime. Consequently, many 
theoretical physicists believe that the dark energy may well be the fabled cosmological 
constant $\Lambda$ introduced by Einstein. As I said, the spirit here is to get you to cosmology
and our first application of Einstein’s field equation as quickly as possible. Thus, we will 
postpone a more detailed discussion of observational cosmology until later in this chapter. 

Instead, we rush headlong to the action

$$S = S_{\rm EH} + S_{\text{cosmological}} = \int d^4x \sqrt{-g} \left( \frac{1}{16\pi G} R - \Lambda \right) \tag{2}$$


We now hold the action of a universe in our hands. (“Amazing!” I say.) 


* To get to cosmology, albeit a purely hypothetical cosmology, even more quickly than as quickly as possible, 
fill the universe with only the gravitational field. See exercise 1. 

† In chapter IX.11, you will learn that a negative constant energy density $\Lambda$ would lead to a completely different
kind of spacetime.

‡ While the cosmological constant may be mathematically simple to describe, it may be one of the most
mysterious concepts in theoretical physics. See chapter X.7. 




An equation of motion for the universe 


Instead of Einstein’s field equation in empty spacetime (1), we now have Einstein’s field 
equation in the presence of the cosmological constant, namely 


$$\delta S = \delta(S_{\rm EH} + S_{\text{cosmological}}) = 0$$


Dutiful scholars would now vary $S_{\text{cosmological}}$. Actually, that’s very easy to do, and if you
remember (V.6.20), you already know how to do it. 

But in keeping with the spirit of the preceding chapter, we will keep things easier than
easy and try to get away with doing as little work as possible. We argue that when we vary
$S_{\text{cosmological}}$ with respect to $g_{\mu\nu}$, the result must be proportional to $\Lambda g^{\mu\nu}$ since there is no
other tensor around: only the determinant of the metric appears in $S_{\text{cosmological}}$. Also, the
spacetime derivative $\partial$ does not appear. Thus, we obtain

$$-(R^{\mu\nu} + a g^{\mu\nu} R) = -16\pi G \beta \Lambda g^{\mu\nu} \tag{3}$$

where $\beta$ is some numerical constant, which we will put off computing until chapters VI.4
and VI.5.

In the preceding chapter, the right hand side is equal to 0. Here we have sort of the
next best thing: the right hand side, while not 0, is proportional to $g^{\mu\nu}$. Just as in the
preceding chapter, we can clean up (3) by contracting it with $g_{\mu\nu}$. We get $-(1 + 4a)R = R =
-16\pi G(4\beta\Lambda)$, where in the second equality we used a result from the preceding chapter.
The scalar curvature $R$ is some constant times $\Lambda$. Eliminating $R$ in (3), we arrive at

$$R_{\mu\nu} = \tilde{\Lambda} g_{\mu\nu} \tag{4}$$

with $\tilde{\Lambda}$ equal to some numerical constant times $\Lambda$. This equation describes the dynamics
of a universe filled with a constant energy density proportional to $\tilde{\Lambda}$.
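
The elimination of $R$ can be written out in two lines. What follows is only a sketch: it takes (3) in the form $-(R^{\mu\nu} + a g^{\mu\nu}R) = -16\pi G\beta\Lambda g^{\mu\nu}$ and assumes the value $a = -\tfrac{1}{2}$ for the coefficient fixed in the preceding chapter, so that $1 + 4a = -1$. The symbol $\tilde{\Lambda}$ for the constant appearing in (4) is a label chosen here for illustration.

```latex
\begin{align*}
g_{\mu\nu} \times (3):\quad
  & -(1+4a)\,R = -16\pi G\,(4\beta\Lambda)
    \;\Longrightarrow\; R = -64\pi G\,\beta\Lambda
    \qquad (a = -\tfrac{1}{2}), \\
\text{back into (3)}:\quad
  & R^{\mu\nu} = 16\pi G\,\beta\Lambda\, g^{\mu\nu} - a\, g^{\mu\nu} R
    = 16\pi G\,\beta\Lambda\,(1+4a)\, g^{\mu\nu}
    = -16\pi G\,\beta\Lambda\, g^{\mu\nu} .
\end{align*}
```

Under these assumptions the constant in (4) is $\tilde{\Lambda} = -16\pi G\beta\Lambda$, which becomes a definite multiple of $G\Lambda$ once $\beta$ is computed.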


An exponentially expanding universe 


Let us now plug the Lemaître–de Sitter metric

$$ds^2 = -dt^2 + a(t)^2 d\vec{x}^2 \tag{5}$$


from chapter V.3 into (4). Note that there we studied the properties of the spacetime 
described by (5) for some assumed a(t), but now we are in the much more powerful 
position of being able to determine this function. 

In chapter V.3, we computed the nonvanishing Christoffel symbols to be

$$\Gamma^0_{ij} = a\dot{a}\,\delta_{ij}, \qquad \Gamma^i_{0j} = \Gamma^i_{j0} = \frac{\dot{a}}{a}\,\delta^i_j \tag{6}$$

Given this, we could now plow ahead and compute the Ricci tensor $R_{\mu\nu}$, plug it into (4),
and solve for $a(t)$. As easy as pie, actually. But in keeping with our lifestyle, let’s wing it
first.
First, as we have been saying ever since chapter I.5, curvature involves two powers of
derivatives, so $R_{\mu\nu}$ should involve $\ddot{a}$ and $\dot{a}^2$. Second, consider the coordinate transformation
$\vec{x}' = \lambda\vec{x}$ for some arbitrary constant $\lambda$, keeping $t$ unchanged, then $a'(t) = \lambda^{-1}a(t)$.
But by the transformation law of tensors, $R'_{00} = R_{00}$, $R'_{ij} = \lambda^{-2}R_{ij}$. So $R_{00}$ must have the
form $\ddot{a}/a$ and $(\dot{a}/a)^2$, and hence the time-time component of (4) gives an equation like
$\ddot{a}/a + (\dot{a}/a)^2 \sim \tilde{\Lambda}$ with various unknown numerical coefficients.

We see immediately that the solution is $a = e^{Ht}$ with $H^2 \sim \tilde{\Lambda}$. An exponentially expanding
universe, as discussed in chapter V.3, pops out!


Solving Einstein’s field equation 


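Actually, it is not hard at all to calculate properly like a decent hard-working physicist. Insert
(6) into the formula for the Ricci tensor obtained by contracting the Riemann curvature
tensor (VI.1.9):

$$R_{\mu\nu} = R^{\sigma}{}_{\mu\sigma\nu} = (\partial_\sigma \Gamma^\sigma_{\mu\nu} + \Gamma^\sigma_{\kappa\sigma}\Gamma^\kappa_{\mu\nu}) - (\partial_\nu \Gamma^\sigma_{\mu\sigma} + \Gamma^\sigma_{\kappa\nu}\Gamma^\kappa_{\mu\sigma}) \tag{7}$$

Since the spatial slice of our metric is rotationally invariant, we already know that $R_{0i}$ vanishes
and $R_{ij} \propto \delta_{ij}$. We obtain

$$R_{00} = -(\partial_0 \Gamma^i_{i0} + \Gamma^i_{0j}\Gamma^j_{i0}) = -\left[ 3\,\partial_0\!\left(\frac{\dot{a}}{a}\right) + 3\left(\frac{\dot{a}}{a}\right)^2 \right] = -3\frac{\ddot{a}}{a} \tag{8}$$

and

$$R_{ij} = (\partial_0 \Gamma^0_{ij} + \Gamma^k_{k0}\Gamma^0_{ij}) - (\Gamma^0_{ik}\Gamma^k_{j0} + \Gamma^k_{i0}\Gamma^0_{kj}) = \left[ \partial_0(a\dot{a}) + (3 - 2)\frac{\dot{a}}{a}(a\dot{a}) \right]\delta_{ij} = (2\dot{a}^2 + a\ddot{a})\delta_{ij} \tag{9}$$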

Note that the scaling conditions $R'_{00} = R_{00}$, $R'_{ij} = \lambda^{-2}R_{ij}$ we derived are indeed satisfied by
the computed $R_{\mu\nu}$.

We could now solve (4), consisting of the two equations

$$R_{00} = -3\frac{\ddot{a}}{a} = -\tilde{\Lambda} \tag{10}$$

and

$$R_{ij} = (2\dot{a}^2 + a\ddot{a})\delta_{ij} = \tilde{\Lambda} a^2 \delta_{ij} \tag{11}$$

By eyeball, we find the solution $a = e^{Ht}$ with $H^2 = \tilde{\Lambda}/3$. Isn’t it easy? (Actually, I cheated
you a teeny bit here. You can either figure out my sleight of hand, or wait until chapter VI.5,
where it will be revealed to you.)

Proceeding more carefully, we could use (10) to eliminate $\ddot{a}$ in (11), thus obtaining
$\dot{a}^2 = H^2a^2$ with the two roots $\dot{a} = \pm Ha$ related by the time reversal transformation $t \to
-t$. These two equations are solved, respectively, by $a = e^{Ht}$ (describing an expanding
universe) and $a = e^{-Ht}$ (describing a contracting universe). This agrees with the invariance
under time reversal of Einstein gravity and more generally of the fundamental laws of
physics.

You may have noticed an apparent “miracle” here: we have two differential equations 
for one unknown function and they “happen” to be compatible. In chapter VI.4, you will 
acquire a deeper understanding of this rather mysterious fact. 
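
The compatibility of (10) and (11) is easy to check numerically. The sketch below is an illustration rather than part of the text's derivation: `lam` stands for the constant on the right-hand side of (4), its value 3.0 is an arbitrary choice made so that $H = 1$, and the derivatives are estimated by central finite differences.

```python
import math

def field_equation_residuals(a, t, lam, h=1e-3):
    """Residuals of (10) and (11) for a scale factor a(t).

    lam is the constant on the right-hand side of (4); derivatives are
    estimated with central finite differences of step h.
    """
    a0 = a(t)
    a_dot = (a(t + h) - a(t - h)) / (2 * h)
    a_ddot = (a(t + h) - 2 * a0 + a(t - h)) / h**2
    res_10 = -3 * a_ddot / a0 + lam                    # R_00 + lam
    res_11 = 2 * a_dot**2 + a0 * a_ddot - lam * a0**2  # coefficient of delta_ij
    return res_10, res_11

lam = 3.0                 # illustrative value, chosen so that H = 1
H = math.sqrt(lam / 3)    # H^2 = lam / 3

def expanding(t):
    return math.exp(H * t)

def contracting(t):
    return math.exp(-H * t)

if __name__ == "__main__":
    for name, a in [("expanding", expanding), ("contracting", contracting)]:
        print(name, field_equation_residuals(a, 0.5, lam))
```

Both residuals come out at the level of the finite-difference error for either sign of the exponent, while a trial function with the wrong rate, such as $e^{2Ht}$, leaves a residual of order one in (10).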


Dark energy 


As you have surely heard, and as I mentioned earlier, observational cosmologists made the 
astonishing discovery that the dominant content of our universe consists of a previously 
unknown dark energy. They counted the number of a certain type of supernova (called 
type Ia) that had been established as “standard candles,” namely objects whose intrinsic 
brightness was known. Thus, from its observed brightness, the distance to the supernova 
could be fixed. The observational data indicate, to everybody’s surprise, that the expansion 
rate of the universe is accelerating, an amazing discovery made even more dramatic² by
the fact that it was made almost simultaneously by two competing teams. 

The expansion history of the universe is determined by its content, as we have seen in 
this chapter and as we will see in more detail in part VIII, and an accelerating expansion 
rate is precisely what is indicated by (10). Thus, the data and Einstein gravity suggest a 
constant energy density. As I mentioned, the cosmological constant provides the simplest 
(and most compelling) explanation. (The alternative is to throw in any number³ of scalar
fields that do not vary in space.) 

Since the original discovery, more observations have been made. Phenomenologically, 
the data are fitted by assuming that the universe is filled with a component described 
by the equation of state of the form P = wp relating pressure to energy density. The 
parameter w = w(z) is taken to be a function of the redshift z. The data indicate that w(z) 
is nearly constant and close to —1. As we will see in the next section, for the cosmological 
constant, w = —1. 

In Planck’s natural units, the observed* $\Lambda$ is of order $(10^{-3}\ \text{eV})^4$. The dark energy
accounts for something like 74% of the total mass content of the universe. The other
26% consists of mostly dark matter with a few percent of intergalactic gas and stars thrown 
in. Thus, the universe we have studied in this chapter, without having to break a sweat, 
could in fact provide a first approximation to our universe. 
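
To get a feel for how small an energy density of order $(10^{-3}\ \text{eV})^4$ is, one can restore the factors of $\hbar c$ and convert to SI units. The sketch below assumes natural units with $\hbar = c = 1$; the function name is hypothetical and the constants are standard values, so the result is an order-of-magnitude estimate, not a fit to data.

```python
HBAR_C_EV_M = 1.973269804e-7    # hbar * c in eV * m
EV_IN_JOULE = 1.602176634e-19   # 1 eV in joules

def energy_density_si(scale_ev):
    """Convert an energy density (scale_ev)^4 in natural units to J/m^3."""
    ev_per_m3 = scale_ev**4 / HBAR_C_EV_M**3   # eV^4 -> eV per cubic meter
    return ev_per_m3 * EV_IN_JOULE

if __name__ == "__main__":
    print(energy_density_si(1e-3), "J/m^3")
```

With a scale of $10^{-3}$ eV this gives a few times $10^{-11}$ joules per cubic meter.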

A major goal of observational cosmology over the next decade or so is to either establish 
or rule out the supposition that the density of dark energy is in fact constant in spacetime. In 
this golden age of cosmology, we should not be surprised, of course, if observations reveal 
more unexpected facts about the universe, but in this text, to streamline the theoretical 
presentation, we will assume that dark energy could be represented by $\Lambda$.


* One aspect of the mystery is that $\sim 10^{-3}$ eV is also roughly the not-yet-understood scale characteristic of
neutrino masses. Pure coincidence? 
Negative pressure


The dark energy has some rather peculiar properties. The popular media often give the 
impression that the strange properties of dark energy somehow have something to do with 
Einstein, whom nobody could understand anyway. In fact, the strange properties follow 
from elementary physics and the statement that the energy density is constant. 

Consider a container (think of a balloon) of volume $V$ filled with some stuff, be it a gas
or something else. Now squeeze on the container and change the volume by $dV$. Then the
energy in the container increases by $dE = -P\,dV$, which in fact defines the pressure $P$
exerted by the stuff. Note that $dV < 0$ (squeezing), $dE > 0$, and $P > 0$. This of course just
states energy conservation*: the work you have done by squeezing is $-P\,dV$.

Once we say that the energy density is a positive constant $\Lambda$, then we are immediately led
to something bizarre. Since $E = \Lambda V$, then $dE = \Lambda\,dV < 0$ since $dV < 0$. But with $dE < 0$
and $dV < 0$, $dE = -P\,dV$ tells us that the pressure $P$ must be negative! (Indeed, you could
see that $P$ is just $-\Lambda$.) So instead of resisting your squeeze, the balloon or container would
suck your hands in.
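
Written as a single chain, the poor man's argument reads as follows. Nothing beyond $E = \Lambda V$ and $dE = -P\,dV$ is assumed; $w$ is the equation-of-state parameter defined by $P = w\rho$ in the preceding section.

```latex
\begin{align*}
E = \Lambda V \;&\Longrightarrow\; dE = \Lambda\, dV, \\
dE = -P\, dV \;&\Longrightarrow\; P = -\frac{dE}{dV} = -\Lambda, \\
\rho = \Lambda \;&\Longrightarrow\; w = \frac{P}{\rho} = -1 .
\end{align*}
```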

The “rich man” has a fancier but still elementary way of saying this. Recall (III.6.16)
from way back, that the energy momentum tensor of a perfect fluid is given by $T^{\mu\nu} =
(\rho + P)U^\mu U^\nu + P\eta^{\mu\nu}$ in flat spacetime, promoted to $T^{\mu\nu} = (\rho + P)U^\mu U^\nu + Pg^{\mu\nu}$ in
curved spacetime. Since the dark energy is allegedly constant in spacetime, there is no
$U^\mu$ available and so $T^{\mu\nu} = Pg^{\mu\nu}$. The energy density is thus given by $T^{00} = Pg^{00} = -P$.
We conclude that, since $\Lambda$ is the energy density, a positive $\Lambda$ gives a negative pressure
$P = -\rho = -\Lambda$, reaching the same conclusion† as the “poor man.” The rich man might
criticize the poor man’s approach by asking him or her to find a container to contain the 
dark energy. The material scientists have yet to come up with such a container. 


What is not forbidden is mandatory 


Historically, Einstein added the cosmological constant $\Lambda$ to his theory and then later
removed it. Although Einstein, with his groundbreaking work on the photoelectric effect, 
was indisputably one of the founders of quantum mechanics, he was first and foremost 
a classical physicist. (Such was his greatness that he could be both at the same time.) In 
classical physics, one could include or exclude possible terms in the action as one pleases; 
the goal is to include enough terms to account for observations. 

But quantum physics, with its probabilistic character and constant fluctuation, differs 
profoundly from classical physics. If you neglected a term not specifically forbidden by 


* More generally, the first law of thermodynamics states dE = TdS — PdV, but here we are not increasing 
the entropy S and there is no temperature. 

† In chapter VI.5, we will derive $T^{\mu\nu} = -\Lambda g^{\mu\nu}$ directly from $S_{\text{cosmological}}$, using the approach of an “even richer
man.” 
a fundamental principle, quantum fluctuations would force that term on you. You must
have a reason for why a given term should not be there. Roughly, that is because quantum 
physics is probabilistic. Physicists can only determine the probabilities of various processes 
occurring. Any process not explicitly forbidden* will occur, even though the probability of 
the process actually occurring may be very small. 

Thus, in the quantum world, Einstein is no longer allowed to remove the cosmological 
constant $\Lambda$ from his theory.

In part II, we touched upon the notion of a classical field theory, and in part IV 
we studied electromagnetic field theory. Fields exhibit waves schematically of the form 
$\sin(\omega(\vec{k})t - \vec{k}\cdot\vec{x})$. In classical physics, the waves could be quiescent. Just as a harmonic
oscillator in quantum mechanics could never be at rest and thus has a minimum energy of
$\frac{1}{2}\hbar\omega$, a quantum field could never be quiescent and thus contributes a minimum energy
density $\sim \int d^3k\, \frac{1}{2}\hbar\omega(\vec{k})$ to spacetime. The puzzle for quantum field theorists is not whether
$\Lambda$ is present in the action, but why observationally it has the value it does. We will come
back to this in chapter X.7. 


Exercises 


1 An even simpler, but less physical, universe than the one described here was discovered by E. Kasner in 1921
(Am. J. Math., vol. 43, p. 217). Show that the metric $ds^2 = -dt^2 + (t^{2p}dx^2 + t^{2q}dy^2 + t^{2r}dz^2)$ with three
constants $p, q, r$ solves Einstein’s equation $R_{\mu\nu} = 0$ provided that $p + q + r = p^2 + q^2 + r^2 = 1$. The Kasner
universe expands or contracts at different rates along the three different spatial directions.


2 The Kasner universe generalizes nicely to higher dimensions. Let’s go to a 5-dimensional metric by adding
$t^{2s}dw^2$ to the $ds^2$ in the preceding exercise. Solve.


3 You might have noticed that the numbers 3 and 2 in (8) and (9), respectively, depend on the dimension of
space. Determine these numbers for arbitrary dimensions. 


Notes 


1. I must confess that occasionally I am also beset by nameless doubts. See the Closing Words to this book. 

2. For an entertaining and exciting account, see Y. Bhattacharjee’s article “A week in Stockholm” (Science 2012, 
vol. 336, p. 28). The 2011 Nobel Prize in Physics was awarded for the discovery of dark energy. 

3. That’s because (at least in part) scalar fields are free, in the sense that they cost nothing. 


* In T. H. White’s The Once and Future King, the boy Arthur dreams of visiting a kingdom governed on the 
principle that whatever is not forbidden is mandatory. The story inspired the physicist Murray Gell-Mann to quip 
that in quantum physics what is not taboo is a commandment. 


VI.3 The Schwarzschild-Droste Metric and Solar System
Tests of Einstein Gravity


Gravity in empty spacetime 


In empty spacetime (around a star or a black hole for example), we learned in chapter VI.1 
that the Ricci tensor vanishes: 


$$R_{\mu\nu} = 0 \tag{1}$$


As remarked earlier, since the Ricci tensor is constructed by summing various components 
of the Riemann tensor, this does not necessarily mean that $R^{\rho}{}_{\sigma\mu\nu} = 0$, which would imply
that spacetime is Minkowski flat. 

Consider a spherical mass distribution, such as a star, of mass M and radius R. Outside 
the massive lump, we already know what the spacetime metric looks like from chapter V.4: 


$$g_{tt} = -A(r), \quad g_{rr} = B(r), \quad g_{\theta\theta} = r^2, \quad g_{\varphi\varphi} = r^2\sin^2\theta \tag{2}$$


with all off-diagonal components vanishing. Indeed, we even listed all the Christoffel 
symbols in the appendix there. But back then, we had no idea what A and B were. Now 
we have (1). 

Our job is conceptually straightforward. Compute the Riemann curvature tensor and 
thence the Ricci tensor. Then determine the two unknown functions A and B by solving 
Einstein’s equation (1) with the boundary condition $A(r) \to \left(1 - \frac{2GM}{r}\right)$ and $B(r) \to 1$ as
$r \to \infty$.


Newton had his plague and Schwarzschild his heavy gunfire 


As you see, the war treated me kindly enough, in spite of the 
heavy gunfire, to allow me to get away from it all and take this 
walk in the land of your ideas. 


—Karl Schwarzschild, writing to A. Einstein

I hope that you have been enjoying your “walk in the land of ideas,” hopefully in pleasant
surroundings, without being bothered by heavy gunfire. 

In 1915, the very same year that Einstein published his theory of general relativity, 
Karl Schwarzschild (1873-1916), an officer serving in the German army on the Russian 
front during World War I, wrote to Einstein saying that he had found¹ the solution for
the spacetime metric around a spherical mass distribution. Interestingly, in Einstein’s
celebrated 1915 article, he only found an approximate solution valid for large $r$ (using
Cartesian coordinates!²), which was in fact adequate for his purpose of working out
observational tests of his theory. By the way, the family name Schwarzschild means “black
sign” or “black shield,”³ not “black child,” contrary to what many (non-German-speaking)
students of general relativity believe. Tragically, Schwarzschild died a year later of a painful 
autoimmune disease contracted on the battlefield. 

So, you should be able to work out the solution in the tranquility and privacy of your own 
home. Here is what you do. Since you already have the Christoffel symbols, you simply 
plug in the appropriate formulas and calculate the Riemann curvature tensor. Then sum 
over a pair of indices to find the Ricci tensor. Set the Ricci tensor to 0, and solve for A and 
B. In fact, you could even bypass the Riemann curvature tensor and compute the Ricci 
tensor directly. Do this before reading on and be glad you are not on the Russian front. 


The Schwarzschild solution 


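You look up the formula for the Ricci tensor

$$R_{\mu\nu} = (\partial_\sigma \Gamma^\sigma_{\mu\nu} + \Gamma^\sigma_{\kappa\sigma}\Gamma^\kappa_{\mu\nu}) - (\partial_\nu \Gamma^\sigma_{\mu\sigma} + \Gamma^\sigma_{\kappa\nu}\Gamma^\kappa_{\mu\sigma}) \tag{3}$$

and plug in the Christoffel symbols from chapter V.4. It is a bit tedious but totally straightforward. For example,

$$\begin{aligned}
R_{tt} &= (\partial_\sigma \Gamma^\sigma_{tt} + \Gamma^\sigma_{\kappa\sigma}\Gamma^\kappa_{tt}) - (\partial_t \Gamma^\sigma_{t\sigma} + \Gamma^\sigma_{\kappa t}\Gamma^\kappa_{t\sigma}) = \partial_r \Gamma^r_{tt} + \Gamma^\sigma_{r\sigma}\Gamma^r_{tt} - 2\Gamma^t_{tr}\Gamma^r_{tt} \\
&= \left(\frac{A'}{2B}\right)' + \left(\frac{A'}{2A} + \frac{B'}{2B} + \frac{2}{r}\right)\frac{A'}{2B} - 2\,\frac{A'}{2A}\,\frac{A'}{2B} \\
&= \frac{A''}{2B} - \frac{A'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) + \frac{A'}{rB}
\end{aligned} \tag{4}$$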


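Proceeding in this way, you obtain

$$R_{tt} = \frac{A''}{2B} + \frac{A'}{rB} - \frac{A'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) \tag{5}$$

$$R_{rr} = -\frac{A''}{2A} + \frac{B'}{rB} + \frac{A'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) \tag{6}$$

$$R_{\theta\theta} = 1 - \frac{1}{B} - \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right) \tag{7}$$

and $R_{\varphi\varphi} = \sin^2\theta\, R_{\theta\theta}$. All other components vanish.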
Again, some features are easy to understand by symmetry considerations. For example,
consider the coordinate transformation $t = \alpha\tau$. Then $A \to \alpha^2 A$, with $B$ unchanged. That $R_{\mu\nu}$
should transform appropriately gives a check on (5–7). The vanishing of the off-diagonal
components follows from rotational invariance and the $t \to -t$ symmetry of the metric.

At this point, you might be worried that you have 3 second order coupled ordinary
differential equations, $R_{tt} = 0$, $R_{rr} = 0$, and $R_{\theta\theta} = 0$ for 2 unknown functions $A$ and $B$.
Einstein’s beautiful theory might yet be inconsistent and fall flat on its face. Now you recall 
that a similar worry presented itself in the preceding chapter: there were 2 equations for 1 
unknown function a(t), but then an apparent “miracle” happened and things worked out 
okay. So let’s proceed and see what happens. 

Staring at (5) and (6), you realize that a good strategy is to get rid of the second derivative.
So form $\frac{R_{tt}}{A} + \frac{R_{rr}}{B} = \frac{1}{rB}\left(\frac{A'}{A} + \frac{B'}{B}\right) = 0$. This instantly solves itself as $AB = 1$, where we fixed
the integration constant by the boundary condition at $r \to \infty$. Eliminating $B = 1/A$ in (7),
we obtain $rA' + A = 1$, with the solution $rA = r + b$ and so $A = 1 + \frac{b}{r}$. The boundary
condition at infinity fixes the integration constant $b$ to be $-2GM$.


An amazing identity? 


It remains for us to hold our breath and plug the solution into, say, $R_{rr} = 0$ to see whether it
is solved. It works! An apparent miracle happens again. Since we physicists do not believe 
in miracles, there must be an amazing identity we don’t yet know about. Be patient. I will 
get to this identity in the next chapter. (Alternatively, you could try to discover this identity!) 
We are in good company, since for many years Einstein did not know about this identity 
either, and this ignorance was one of the reasons that it took him 10 years to get to his field 
equations. 
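
The apparent miracle can be watched happening numerically. The sketch below evaluates the three Ricci components from the standard static, spherically symmetric formulas used in this chapter, $R_{tt}$, $R_{rr}$, and $R_{\theta\theta}$ in terms of $A$, $B$, and their derivatives, estimating the derivatives by central finite differences. The choice of units with $GM = 1$, the sample radius, and the function names are assumptions made for illustration.

```python
def ricci_components(A, B, r, h=1e-3):
    """Return (R_tt, R_rr, R_thetatheta) for ds^2 = -A dt^2 + B dr^2 + r^2 dOmega^2.

    A and B are callables of r; first and second derivatives are estimated
    with central finite differences of step h.
    """
    A0, B0 = A(r), B(r)
    A1 = (A(r + h) - A(r - h)) / (2 * h)
    B1 = (B(r + h) - B(r - h)) / (2 * h)
    A2 = (A(r + h) - 2 * A0 + A(r - h)) / h**2
    common = A1 / A0 + B1 / B0
    R_tt = A2 / (2 * B0) + A1 / (r * B0) - A1 / (4 * B0) * common
    R_rr = -A2 / (2 * A0) + B1 / (r * B0) + A1 / (4 * A0) * common
    R_thth = 1 - 1 / B0 - r / (2 * B0) * (A1 / A0 - B1 / B0)
    return R_tt, R_rr, R_thth

GM = 1.0  # illustrative units

def A_schw(r):
    return 1.0 - 2.0 * GM / r

def B_schw(r):
    return 1.0 / A_schw(r)

if __name__ == "__main__":
    print(ricci_components(A_schw, B_schw, 10.0))
```

All three components vanish to within the finite-difference error, even though only two combinations of them were used to find $A$ and $B$. Replacing `B_schw` by a constant function, for instance, leaves $R_{\theta\theta}$ visibly nonzero.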


How does the radius come in? 


Meanwhile, let’s try to imagine the excitement Schwarzschild must have felt in the 
trenches, discovering that the curved spacetime around a spherical mass distribution 
of mass M and radius R is described by this remarkable metric 


$$ds^2 = -\left(1 - \frac{2GM}{r}\right)dt^2 + \left(1 - \frac{2GM}{r}\right)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2) \tag{8}$$


soon to be named after him.* Einstein was elated when he learned that his highly nonlinear 
field equations had such a simple solution. A priori, A and B could have been two totally 
complicated functions. 

Confusio suddenly speaks up. “Where is the dependence on the radius $R$?”


* See appendix 5. 
Good question! The answer is of course that the solution (8) only holds for $r > R$. For
$r < R$, the empty spacetime field equation $R_{\mu\nu} = 0$ that we obtained in chapter VI.1 by
winging it is obviously not valid. If we are talking about the sun, for example, we would 
have to take into account the hot gas in the solar interior. 


The Schwarzschild coordinate singularity 


After admiring this metric for a while, you might start worrying again even if you were not
born a worrywart. What about $g_{tt}$ vanishing and $g_{rr}$ blowing up at the Schwarzschild radius
$r_S = 2GM$? And vice versa for $g^{tt} = 1/g_{tt}$ and $g^{rr} = 1/g_{rr}$? Indeed, both Schwarzschild and
Einstein were alarmed by this and thought somewhat confusedly⁴ about this problem.*

First of all, these metric coefficients, being components of a tensor in a particular
coordinate system, depend on the coordinate system. Just as the usual spherical coordinate
is no good at the north and south poles, we could merely have made a bad coordinate choice
at $r = r_S$. Indeed, for the sphere, we also have chosen (in chapter I.6) coordinates in which
the metric blows up on the equator. In chapter VII.2, we will show that this is indeed
the case by exhibiting a set of coordinates, namely the Kruskal⁵ coordinates, in which the
Schwarzschild solution is not singular at $r_S$. In fact, allow me to remind you that, way back
in chapter I.6, we already discussed the distinction between a coordinate singularity and
an actual or physical singularity, where the geometry itself goes out of control. Remember
the Einstein-Rosen bridge, a kind of tunnel or “wormhole” between two flat spaces?

There is a relatively quick way to allay your worry. Let us look at a scalar quantity such
as† $R^{\mu\nu\rho\sigma}R_{\mu\nu\rho\sigma}$, which turns out to be $\frac{12 r_S^2}{r^6}$, with a perfectly innocuous behavior around
$r = r_S$.

Why look at a scalar? Because scalars transform like $S'(x') = S(x)$. Thus, if a scalar blows
up in one coordinate system, it blows up in all coordinate systems. It follows that if a scalar
blows up, then we are in trouble. Tensors, in contrast, can “catch” a singularity going from
one coordinate system to another: they might get infected by various factors of $\frac{\partial x'^\mu}{\partial x^\nu}$ in their
transformation law. Hence, if a component of a tensor, such as $g_{rr}$ and $g^{tt}$, blows up in one
coordinate, it does not mean that it will blow up in all coordinate systems. The Mercator
map is no good at the poles, but it does not mean that the poles have anything singular
about them.

You see that it was totally worth it to learn about scalars and tensors! Of course, calculat- 
ing one scalar does not prove anything: we have to check out all the scalars. I won’t digress 
by discussing how many “fundamental” scalars there are in a Riemannian spacetime. But
verifying that at least one scalar, namely $R^{\mu\nu\rho\sigma}R_{\mu\nu\rho\sigma}$, is perfectly well behaved at $r_S$ does
allay our anxiety a bit. In fact, as we will see in chapter VII.2, the singularity at $r_S$ exhibited
in the metric is merely due to a poor coordinate system, of the type of singularity we
encounter at the poles in a Mercator map.


* At one point, Einstein thought that a particle falling in would bounce back at $r = r_S$.
† The Jargon Guy tells us that this is known as the Kretschmann scalar.

Incidentally, that $R^{\mu\nu\rho\sigma}R_{\mu\nu\rho\sigma}$ blows up at $r = 0$ shows that a real physical singularity
lurks there: Einstein theory as we know it must break down at infinite spacetime curvature.
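
The contrast between the metric component and the scalar is easy to tabulate. This is a minimal sketch using $g_{rr} = (1 - r_S/r)^{-1}$ from the Schwarzschild metric and the value $12 r_S^2/r^6$ (equivalently $48G^2M^2/r^6$) for the Kretschmann scalar; the units with $GM = 1$ and the function names are illustrative assumptions.

```python
def g_rr(r, r_s):
    """Radial metric component; diverges as r approaches r_s."""
    return 1.0 / (1.0 - r_s / r)

def kretschmann(r, r_s):
    """Curvature scalar R^{mu nu rho sigma} R_{mu nu rho sigma} = 12 r_s^2 / r^6."""
    return 12.0 * r_s**2 / r**6

r_s = 2.0  # units with GM = 1 (illustrative)

if __name__ == "__main__":
    for r in (10.0, 2.0 * (1 + 1e-9), 1e-3):
        print(f"r = {r:.3e}  g_rr = {g_rr(r, r_s):.3e}  K = {kretschmann(r, r_s):.3e}")
```

Just outside $r_S$ the metric component is enormous while the scalar stays of order $1/r_S^4$; only as $r \to 0$ does the scalar itself grow without bound.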


Here is a simple but striking example of a coordinate singularity. Transform coordinates
by plugging $x = 2\rho^{1/2}$ into $dl^2 = dx^2 + dy^2$ to obtain $dl^2 = (d\rho^2/\rho) + dy^2$. The metric
appears to blow up at $\rho = 0$, but that’s merely due to a bad coordinate choice, as you can
see as clear as day. For $\rho$ negative we appear to have pulled a spacetime out of a space hat!
Let $w = -\rho > 0$ for $\rho$ negative, and define $t = 2w^{1/2}$, so that⁶ $ds^2 = dl^2 = -dt^2 + dy^2$! Of
course, $\rho$ negative is precisely where the original coordinate $x$ makes no sense.
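
The two substitutions in this example can be spelled out step by step. The lines below simply carry out the differentiation, with $\rho$ the new coordinate and $w = -\rho$, $t = 2w^{1/2}$ as defined in the paragraph above.

```latex
\begin{align*}
x = 2\rho^{1/2} \;&\Longrightarrow\; dx = \rho^{-1/2}\, d\rho
   \;\Longrightarrow\; dx^2 = \frac{d\rho^2}{\rho}, \\
\rho = -w,\quad t = 2w^{1/2} \;&\Longrightarrow\; dt^2 = \frac{dw^2}{w} = -\frac{d\rho^2}{\rho}
   \;\Longrightarrow\; dl^2 = -dt^2 + dy^2 .
\end{align*}
```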


A whiff of the black hole 


Although the singularity at $r = r_S$ is merely due to a bad choice of coordinates, it leads
nevertheless to important physics, as we will explore in chapter VII.2. Our answer to
Confusio’s question provides a first hint: if $R > r_S = 2GM$, then we do not have to worry
about this singularity. But this condition, as we discussed way way back in part 0, is
precisely that given by Michell and Laplace for the mass $M$ not to be a black hole!

Meanwhile, we will work out Einstein’s two celebrated solar system tests.* For the sun,
$r_S \simeq 3$ km, which is tiny compared to its radius $R_\odot$. This again reflects the weakness of
the gravitational force. For your information, for various astrophysical objects, the typical
values of $r_S/R$ are given by $10^{-9}$ (earth), $10^{-6}$ (sun), $10^{-4}$ (white dwarf), and $10^{-1}$ (neutron
star).
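
These orders of magnitude follow from $r_S = 2GM/c^2$ once $c$ is restored. In the sketch below the earth and sun entries use standard reference values, while the white dwarf and neutron star masses and radii are representative figures assumed purely for illustration.

```python
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s
M_SUN = 1.989e30  # kg

def schwarzschild_radius(mass_kg):
    """r_S = 2 G M / c^2 in meters."""
    return 2.0 * G * mass_kg / C**2

# (mass in kg, radius in m)
BODIES = {
    "earth": (5.972e24, 6.371e6),
    "sun": (M_SUN, 6.957e8),
    "white dwarf": (0.6 * M_SUN, 8.8e6),    # assumed representative values
    "neutron star": (1.4 * M_SUN, 1.2e4),   # assumed representative values
}

def compactness(name):
    """Ratio r_S / R for one of the bodies listed above."""
    mass, radius = BODIES[name]
    return schwarzschild_radius(mass) / radius

if __name__ == "__main__":
    for name in BODIES:
        print(f"{name:12s} r_S/R = {compactness(name):.1e}")
```

The sun's Schwarzschild radius comes out just under 3 km, and the ratios span roughly eight orders of magnitude from the earth to a neutron star.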


The deflection of light and a factor of 2 


For the deviation of light by the sun I obtained twice the former
amount. 


—A. Einstein, writing to Arnold Sommerfeld, late 1915 


Newton himself wondered, “Do not Bodies act upon Light at a distance, and by their action 
bend its Rays?” In 1801, Johann Soldner used Newton’s corpuscular theory supposing 
light to consist of a stream of minuscule particles and calculated the deflection of light by
astronomical objects, thus obtaining the Newtonian value against which we now compare 
Einstein’s value. Recall that you did this very same calculation as an exercise way back in 
chapter I.1. 

History often takes curious turns. In 1911, Einstein, unaware of Soldner’s calculation, 
predicted that light would bend in a gravitational field in his still-evolving theory of gravity. 


* By now, there are a number of other tests. In appendix 2, we discuss radar echo delay. Some tests actually test 
the equivalence principle. For example, Nordtvedt noted that accurate measurements of the earth-moon distance 
would reveal whether the earth and the moon fall toward the sun at slightly different rates. Thus far, Einstein 
has triumphed. Otherwise, you would have heard about it, duh. 
He followed a naive approach, reasoning that since energy was equivalent to mass, the
photon could be thought of as having a tiny mass. In hindsight, we see that Einstein would 
simply recover the Newtonian value. In fact, the correct answer is a factor of 2 larger, as 
we will now derive. 

Back in chapter V.4, we had already worked out the motion of light and material particles
in a static isotropic Einstein spacetime described by two unknown functions $A(r)$ and
$B(r)$. Now that we know what they are, all we have to do is plug these into the appropriate
equations from that chapter, namely $AB = 1$ and $B^{-1} = 1 - \frac{2GM}{r}$.

Often, it is convenient to use units in which $G$ is set to 1. Then, (V.4.22) in particular
gives us $\frac{r'^2}{r^4} + \frac{1}{r^2}\left(1 - \frac{2M}{r}\right) = \frac{1}{b^2}$. As in chapter I.1, change variable from $r$ to $u = 1/r$, so
that $r' = \frac{dr}{d\varphi} = -\frac{u'}{u^2}$ with $u' = \frac{du}{d\varphi}$. Then,

$$u'^2 + u^2 - 2Mu^3 = \frac{1}{b^2} \tag{9}$$

Differentiating, we obtain the “analog Newtonian” equation

$$u'' + u = 3Mu^2 \tag{10}$$

with $\varphi$ playing the role of time.

Without the $M$ term, this is the harmonic oscillator equation, with the solution $bu =
\sin\varphi$, which you recognize as just saying that light moves in a straight line. This suggests
that we treat the $M$ term as a perturbation. Plugging in $bu = \sin\varphi + bu_1$, we have,
keeping only first order terms, $u_1'' + u_1 \simeq 3(M/b^2)\sin^2\varphi$. The solution to this order is then
$bu \simeq \sin\varphi + (M/b)(2 - \sin^2\varphi)$. For a light ray grazing the sun, the expansion parameter
$M/b \sim r_S/R_\odot$ is just the ratio of the sun’s Schwarzschild radius to its actual radius, which,
as we anticipated, is tiny.

To work out the deflection, refer to figure 1 and warm up with the no-deflection case (that
is, with $M$ set to 0) $bu = \sin\varphi$. As $r \to \infty$, $u \to 0$, which corresponds to $\varphi = 0$ and $\varphi = \pi$. No
deflection, as expected. With $M \neq 0$, for $r \to \infty$, $u \to 0$, the resulting quadratic equation
for $\sin\varphi$ yields two roots. We reject one of these roots as unphysical, with the other root
being $\sin\varphi(r = \infty) = -2M/b$. As the light is coming in from infinity, $\varphi = -2M/b$, and
as it is going out to infinity, $\varphi = \pi + (2M/b)$. Thus, Einstein, realizing his error in 1911,
obtained in 1915 a deflection of $\Delta\varphi = 4M/b = 4GM_\odot/R_\odot \simeq 1.75''$, twice the Newtonian
value.
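
Both the perturbative result $4M/b$ and the number $1.75''$ can be checked with a few lines of code. The sketch below integrates $u'' + u = 3Mu^2$ outward from the point of closest approach with a classical Runge-Kutta step and, separately, evaluates $4GM_\odot/(R_\odot c^2)$ with $c$ restored. The step size, the sample value $M/b = 10^{-3}$, the function names, and the solar constants are assumptions chosen for illustration.

```python
import math

def bending_angle(M, b, h=1e-4):
    """Total deflection from integrating u'' + u = 3 M u^2 (units G = c = 1).

    Starts at closest approach (u' = 0) and follows the ray outward until
    u = 1/r reaches zero; by symmetry the total bending is 2*psi_out - pi.
    """
    # Closest approach: u0^2 - 2 M u0^3 = 1/b^2, solved by fixed-point iteration.
    u0 = 1.0 / b
    for _ in range(100):
        u0 = math.sqrt(1.0 / b**2 + 2.0 * M * u0**3)

    def accel(u):
        return 3.0 * M * u * u - u

    u, v, psi = u0, 0.0, 0.0
    while True:
        # One RK4 step for the first-order system u' = v, v' = accel(u).
        k1u, k1v = v, accel(u)
        k2u, k2v = v + 0.5 * h * k1v, accel(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, accel(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, accel(u + h * k3u)
        u_new = u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        if u_new <= 0.0:
            # Linear interpolation for the angle at which u crosses zero.
            psi_out = psi + h * u / (u - u_new)
            return 2.0 * psi_out - math.pi
        u, v, psi = u_new, v_new, psi + h

def solar_deflection_arcsec(G=6.674e-11, M_sun=1.989e30, R_sun=6.957e8, c=2.998e8):
    """4 G M / (R c^2) for a ray grazing the solar limb, in seconds of arc."""
    radians = 4.0 * G * M_sun / (R_sun * c**2)
    return math.degrees(radians) * 3600.0

if __name__ == "__main__":
    M, b = 1e-3, 1.0
    print("numerical:", bending_angle(M, b), " first order:", 4 * M / b)
    print("sun:", solar_deflection_arcsec(), "arcsec")
```

For small $M/b$ the integrated angle agrees with $4M/b$ up to corrections of relative order $M/b$, which is the expected size of the next term in the perturbation series.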

Shortly after the end of World War I, the Royal Society financed two expeditions, one to 
Brazil led by Andrew Crommelin and one to Africa led by Arthur Eddington, to observe the 
total solar eclipse* of May 29, 1919, with the express purpose of testing Einstein’s theory. 
If the light from a distant star bends as it grazes the edge of the sun (so that the deflection
of light predicted by Einstein would be as large as possible), then the position of the star 
would appear to be shifted from its known position. 


* As a naive theorist, Einstein wrote to George Hale, the director of the Mount Wilson Observatory, wanting 
to know “how close to the Sun fixed stars could be seen in daylight” (italics Einstein’s).⁷ Hale explained that
exploiting a solar eclipse would be more promising. 
Figure 1 The deflection of light.


Mercury and heart palpitations 


Imagine my joy at... the result that the equations give the 
perihelion motion of Mercury correctly. For a few days I was
beside myself with joyous excitement. 


—A. Einstein, writing to Paul Ehrenfest, 1916 


For two centuries after Newton, due to the efforts of greats like Lagrange, Laplace, Bessel,⁸ 
and Le Verrier, planetary orbits were calculated to astonishing accuracy. The perihelion of 
Mercury was observed to advance (as depicted, vastly exaggerated, in figure 2) by something 
like 5,600” (seconds* of arc) per century. After all the known effects (for example, the pull 
of Jupiter accounted for 153”) were taken out, a troublesome discrepancy of 43” per century 
remained. On the basis of a similar discrepancy in the orbit of Uranus, Urbain Le Verrier 
(1811-1877) had triumphantly predicted the existence of the previously unknown Neptune. 
A planet named Vulcan was similarly predicted to orbit between the sun and Mercury, but 
it was never found. 

Then Einstein proposed his curved spacetime, and out pops the 43” per century. Amazing! 
I still find it incredible that this clunk of rock would know, every time it completes 
a revolution around the sun, to move ahead by a teeny bit precisely as dictated by the 
curvature of spacetime. It’s a tribute not only to Einstein, but also to all those celestial 
mechanicians from Tycho Brahe on, whose massive efforts allow us inhabitants of the third 
planet from the sun to understand the movement of the celestial sphere at this level of 
minute detail. 


* Have you ever wondered what the term “second” used in measuring angles and time has to do with the 
notion of the ordinal number “second”? Well, the first is a corruption of a phrase containing the second. Ptolemy 
proposed subdividing the degree in mapping heaven and earth, and his subdivisions became known in Latin as 
“partes minutae primae” and “partes minutae secundae.” 

Figure 2 Mercury’s perihelion advances (vastly exaggerated). 


The calculation of the perihelion shift,⁹ while a monument to theoretical physics, 
amounts “merely” to a beautiful exercise in Newtonian mechanics as I have already said, 
and so reluctantly I will relegate it to appendix 1. 

Later, Einstein told his friend Adriaan Fokker¹⁰ (1887-1972) that he had heart palpitations 
when he got the 43” per century. He also wrote to his friend Sommerfeld saying 
“How helpful* to us here is astronomy’s pedantic accuracy, which I often used to ridicule 
secretly!”¹¹ 


Einstein’s luck 


It is legitimate to speak of a pound of light as we speak of a 
pound of any other substance. . . . I have calculated that . . . an 
Electric Light Company would have to sell† light at the rate of 
£140,000,000 a pound. 


—Arthur Eddington 


It's Eddington’s deflection of light that made Einstein a worldwide celebrity—the general 
public could hardly be expected to care about the perihelion of Mercury. But space warp? 
Now that’s another story! J. J. Thomson, the discoverer of the electron, presiding over a 
special meeting at the Royal Society convened to announce the result of the solar eclipse 
expeditions, hailed the result as the most important since Newton's work and Einstein’s 
theory as “one of the highest achievements of human thought,” which regrettably, he 
added, was incomprehensible. “No one can understand the new law of gravitation without 
a thorough knowledge of the theory of invariants and of the calculus of variations.” Well, 
dear reader, I gave both of them to you already back in part I and part II, respectively. 

I can now also tell you that it was not a reporter who asked Eddington the question in 
the famous story I recounted in the preface. It was Ludwik Silberstein, a Polish-American 
physicist who had studied Einstein’s theory. According to one account, he was expecting 


* Newton similarly benefited from the labor of Brahe and Kepler. By the way, the Danes say that they’ve had 
a “Tycho Brahe day,” meaning that they’ve had a really bad day. 

† At 1920 prices. Remember my ranting and raving in the introduction about using sensible units, not 
something like pound per pound. 

Eddington to name him, Silberstein, as the third. Thus, Eddington’s response could be 
seen as not only arrogant, but insultingly arrogant. Later, Silberstein claimed to have 
discovered a fatal flaw in Einstein’s theory, thus provoking the 1935 Einstein-Silberstein 
debate, which evidently Einstein won. 

Einstein was quite capable of pulling the leg of his friend Max Planck, once remarking: 


[Max Planck] was one of the finest people I have ever known . . . but he didn’t really understand 
physics, [because] during the eclipse of 1919 he stayed up all night to see if it would confirm 
the bending of light by the gravitational field. If he had really understood [the general theory 
of relativity], he would have gone to bed the way I did.¹² 


But that was staircase wit on Einstein’s part. In fact, he was almost preternaturally lucky, 
as documented by Waller.¹³ After Einstein’s mistaken calculation in 1911, reproducing 
Soldner’s 1801 Newtonian result, there was in 1912 an Argentinian eclipse expedition 
that encountered bad weather. Next, with Einstein still blissfully unaware of his error, he 
convinced his friend the astronomer Erwin Freundlich to organize an expedition, financed 
by the munitions manufacturer Krupp, to observe the deflection of light during a solar 
eclipse in the Crimea on August 21, 1914.¹⁴ Not surprisingly, but fortunately for Einstein, 
the German astronomers, with all their telescopes and financing by Krupp, were promptly 
arrested by the Russians as spies. 

Meanwhile, during the war, Einstein discovered his factor-of-2 error. Without these 
twists and turns of history, his celebrity-making triumph might have been a wet fizzle. 
It has also been suspected that Eddington, an enthusiast for Einstein’s theory, might have 
fudged¹⁵ the data in Einstein’s favor. He was also an ardent pacifist, like Einstein, and 
might have been eager to show British support for the work of a German citizen. 


Gravitational lensing 


In less than a century, the deflection of light has come a long way, from a minute effect to 
a major tool in our exploration of the universe. As you have no doubt heard, and as was 
mentioned in chapter VI.2, matter in the universe appears to be dominated by an unseen 
dark matter, rather than the luminous matter we know and love, consisting of nucleons 
and electrons. Dark matter, while it does not emit or absorb light, interacts gravitationally 
and thus clumps. Indeed, it is now believed that the galaxies consist of enormous lumps 
of dark matter, each with an island of luminous matter sort of floating inside. Consider 
a distant light source (a quasar, a supernova, a galaxy—it does not matter what). Suppose 
that a large distribution of unseen dark matter is located between us and the light source. 
One way we could detect the presence of this distribution is by how it deflects the light 
from the distant source, known as gravitational lensing. 

For several reasons, I will not go into a detailed discussion of this rapidly developing 
and important subject. Once we work out how a light ray deflects, the rest of the lensing 
calculation involves rather intricate though straightforward trigonometric and algebraic 
equations that have nothing to do with general relativity as such. For comparison with 
observational data, there are of course numerous effects to be taken into account, such as 
the possibility that both the light source and the dark matter distribution could be moving 
at close to light speed relative to us. I will however make the important and interesting 
remark that the deflection angle $\Delta\varphi$ increases with decreasing impact parameter $b$. With 
the kind of lens you are used to, such as the pair inside your head you are using to read 
this sentence, the deflection angle $\Delta\varphi$ decreases with decreasing impact parameter $b$ and 
thus leads to focusing. Gravitational lensing works oppositely, thus producing in some 
circumstances ring-like images known as Einstein rings. 

I could perhaps close this section by paraphrasing my colleague Tommaso Treu, a 
distinguished practitioner of gravitational lensing, that this exciting subject, including 
Einstein rings, is best appreciated by raising a wine glass filled with a fine white wine 
to the candles illuminating an elegant dinner. 


Appendix 1: Planetary orbit in the Schwarzschild metric 


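We start by plugging $A$ and $B$, as given in (8), into (V.4.15) to obtain (with a tiny notational change) 

$$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + v(r) = \frac{1}{2}(\epsilon^2 - 1) \tag{11}$$

with the potential 

$$v(r) = -\frac{1}{2} + \frac{1}{2}\left(1 - \frac{2M}{r}\right)\left(1 + \frac{l^2}{r^2}\right) = \frac{l^2}{2r^2} - \frac{M}{r} - \frac{Ml^2}{r^3} \tag{12}$$

Recall from (V.4.13) that $\epsilon$ is defined by $\frac{dt}{d\tau} = \frac{\epsilon}{A(r)}$; in other words, $\epsilon$ is the value of $\frac{dt}{d\tau}$ at $r = \infty$, namely the 
energy of the particle divided by its mass. 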

It is instructive to compare with the Newtonian potential in chapter I.1. The first term in $v(r)$ is the familiar 
centrifugal term, the second is the universal gravitational attraction. Remarkably, going from Newton to Einstein, 
we merely have to add to the potential an extra $1/r^3$ term. Also, we have $\frac{dr}{d\tau}$ instead of $\frac{dr}{dt}$. We now bring what we 
learned in classical mechanics to bear on (11). 

To determine the shape of the orbit $r(\varphi)$ and hence the perihelion shift, we repeat what we did for the deflection 
of light and define $r'(\varphi) = \frac{dr}{d\varphi} = \dot r/\dot\varphi = r^2\dot r/l$, since, as you recall from (V.4.14), $\dot\varphi = \frac{d\varphi}{d\tau} = \frac{l}{r^2}$, with $l$ the conserved 
angular momentum (per unit mass) of the particle. As in the deflection of light calculation, change variable from 
$r$ to $u = 1/r$. Plugging all this into (11) and (12), we obtain¹⁶ (with $u' = \frac{du}{d\varphi}$) 

$$u'^2 + u^2 - 2\sigma u - \lambda u^3 = 2E \qquad \text{(Einstein)} \tag{13}$$

where we have defined $\sigma = M/l^2$, $\lambda = 2M$, and $2E = (\epsilon^2 - 1)/l^2$. What we should do is of course compare this 
Newtonian problem with the Newtonian problem you, yes you, solved back in chapter I.1, namely 

$$u_0'^2 + u_0^2 - 2\sigma u_0 = 2E \qquad \text{(Newton)} \tag{14}$$

with the obvious solution $u_0 = \sigma(1 + e\cos\varphi)$. As we discussed in chapters I.1 and I.4, not to precess is the 
exceptional case, valid only for Newton’s inverse square force. 

We now treat the $\lambda$ term in (13) as a perturbation. (Incidentally, even though you already know that $\lambda$ is tiny, 
you are invited to plug in the numbers and show that $\lambda \sim 10^{-8}$ for Mercury, measured in units of its orbital radius.) Thus, write $u = u_0 + u_1$, with $u_1$ of 
order $\lambda$, plug into (13), and collect terms of order $\lambda$. We obtain 

$$-\sin\varphi\, u_1' + \cos\varphi\, u_1 = \frac{\lambda\sigma^2}{2e}(1 + e\cos\varphi)^3 \tag{15}$$


At this point, the typical student would fire up the computer and push a few buttons to obtain the solution $u_1(\varphi)$, 
and indeed, we could all do exactly that. But it is also rather neat to think through the problem. 

The left hand side of (15) is linear in $u_1$, with the driving term on the right hand side given by a sum of 1, 
$\cos\varphi$, $\cos^2\varphi$, and $\cos^3\varphi$. We might expect $u_1$ to be given by an analogous sum of terms. But we are not interested 
in most of these terms. The constant term in $u_1$, for example, when added to $u_0$, would just shift $\sigma$ by one part in 
$10^8$. Periodic terms, such as $\cos^2\varphi$, also do not interest us; after $\varphi \to \varphi + 2\pi$, they would just return $u = u_0 + u_1$ 
and hence $r$ to the same place. We want aperiodic terms, such as $\varphi$, $\varphi\sin\varphi$, and $\varphi\cos\varphi$. You can see by inspection 
that the first possibility doesn’t work, but the second does (since $-\sin\varphi(\varphi\sin\varphi)' + \cos\varphi(\varphi\sin\varphi) = -\sin^2\varphi$ 
and the right hand side contains $\cos^2\varphi$) and the third doesn’t. (If you must know, the complete solution to (15) 
has the form $u_1 = \alpha + \beta\cos\varphi + \gamma\cos^2\varphi + \frac{3}{2}e\lambda\sigma^2\varphi\sin\varphi$, where we do not give a flying nickel about $\alpha$, $\beta$, $\gamma$, 
which, however, you can determine easily enough.) 
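
If you would rather not take the form of $u_1$ on faith, here is a small Python check. The values of $\alpha$, $\beta$, $\gamma$ below are not quoted from the text; they are what one gets by matching powers of $\cos\varphi$ on the two sides of (15), so treat them as this sketch's own working. The parameter values in the demo are illustrative, not Mercury's.

```python
import math


def coefficients(lam, sigma, e):
    """Coefficients of u1 = alpha + beta cos(phi) + gamma cos(phi)^2 + delta phi sin(phi)."""
    k = lam * sigma**2 / (2.0 * e)     # prefactor on the right hand side of (15)
    delta = 3.0 * e**2 * k             # equals (3/2) e lam sigma^2
    gamma = -(e**3) * k
    alpha = 3.0 * e * k - 2.0 * gamma
    beta = k + delta
    return alpha, beta, gamma, delta


def residual(phi, lam, sigma, e):
    """Left hand side minus right hand side of (15) for the trial solution u1."""
    alpha, beta, gamma, delta = coefficients(lam, sigma, e)
    s, c = math.sin(phi), math.cos(phi)
    u1 = alpha + beta * c + gamma * c**2 + delta * phi * s
    du1 = -beta * s - 2.0 * gamma * c * s + delta * (s + phi * c)
    lhs = -s * du1 + c * u1
    rhs = lam * sigma**2 / (2.0 * e) * (1.0 + e * c) ** 3
    return lhs - rhs


if __name__ == "__main__":
    # Illustrative (not physical) parameter values.
    lam, sigma, e = 2.0e-3, 0.5, 0.2
    worst = max(abs(residual(0.1 * n, lam, sigma, e)) for n in range(200))
    print(f"largest residual for 0 <= phi < 20: {worst:.1e}")
```

The residual vanishes up to rounding error, and only the `delta` term grows with $\varphi$, which is the whole point.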


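We thus obtain, following Einstein, that $u \simeq \sigma(1 + e\cos\varphi + \frac{3}{2}\lambda\sigma e\varphi\sin\varphi) \simeq \sigma(1 + e\cos[(1 - \frac{3}{2}\lambda\sigma)\varphi])$. 

For $r$ to reach the same value it had at $\varphi = 0$, we need to have $\varphi = 2\pi/(1 - \frac{3}{2}\lambda\sigma)$. In other words, the perihelion 
advances by* $\Delta\varphi = 3\pi\lambda\sigma = 6\pi(M/l)^2$. 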
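
To see the famous 43” emerge from $6\pi(M/l)^2$, here is a minimal Python sketch. It assumes the Newtonian relation $l^2 = Ma(1 - e^2)$, which follows from $u_0 = \sigma(1 + e\cos\varphi)$ with $\sigma = M/l^2$ and the geometry of an ellipse of semimajor axis $a$. The orbital data for Mercury are rounded reference values assumed for illustration, not taken from the text.

```python
import math

# Rounded textbook constants in SI units (assumed values, not from the text).
G = 6.674e-11                 # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8                   # speed of light, m/s
M_SUN = 1.989e30              # mass of the sun, kg

# Rounded orbital data for Mercury (assumed values).
A_MERCURY = 5.791e10          # semimajor axis, m
E_MERCURY = 0.2056            # eccentricity
PERIOD_MERCURY_DAYS = 87.969  # orbital period, days
DAYS_PER_CENTURY = 36525.0


def advance_per_orbit(mass_kg, semimajor_m, eccentricity):
    """Perihelion advance 6 pi (M/l)^2 in radians per revolution."""
    m = G * mass_kg / C**2                                   # M in meters
    l_squared = m * semimajor_m * (1.0 - eccentricity**2)    # Newtonian l^2
    return 6.0 * math.pi * m**2 / l_squared


def advance_arcsec_per_century(mass_kg, semimajor_m, eccentricity, period_days):
    """Accumulate the per-orbit advance over a century and convert to arcseconds."""
    orbits = DAYS_PER_CENTURY / period_days
    radians = orbits * advance_per_orbit(mass_kg, semimajor_m, eccentricity)
    return math.degrees(radians) * 3600.0


if __name__ == "__main__":
    shift = advance_arcsec_per_century(M_SUN, A_MERCURY, E_MERCURY, PERIOD_MERCURY_DAYS)
    print(f"Mercury: {shift:.1f} arcsec per century")
```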


Appendix 2: Radar echo delay 


To these two classic tests, deflection of light and perihelion shift, we can now add radar echo delay, proposed 
and pushed through by Shapiro in the 1960s. A radar beam is bounced off the planet Venus, and the time it 
takes for the echo to get back to the earth is carefully measured. The terminology “delay” is unfortunate. Like 
everything else, the photons in the beam get the best possible deal in the curved spacetime around the sun: they 
follow a geodesic of course. So what is the “delay”? The delay is in comparison with what would be expected in 
a Newtonian world. 

By now you should be able to work out this problem by yourself without reading on. Just plug in the appropriate 
formulas in chapter V.4 and in this chapter. A hint: for the two classic tests, we need the expression for $d\varphi/dr$, 
but for the radar echo delay, we need $dt/dr$ instead. 

So, look up the expression (V.4.21) for $dr/d\zeta$ and the conservation law $dt/d\zeta = \epsilon/A(r)$. Eliminate the affine 
parameter $\zeta$ by dividing. We obtain 

$$\left(\frac{dr}{dt}\right)^2 = \frac{A(r)}{B(r)}\left(1 - \frac{b^2}{r^2}A(r)\right) = \left(1 - \frac{r_S}{r}\right)^2\left(1 - \frac{b^2}{r^2}\left(1 - \frac{r_S}{r}\right)\right) \tag{16}$$


(with the Schwarzschild radius $r_S = 2GM$, as you may recall). The first expression is general, the second is specific 
to the Schwarzschild solution. As explained in chapter V.4, physics does not depend on $\epsilon$ and $l$ separately, but 
only on $b^2 = l^2/\epsilon^2$. The radius $r_0$ at closest approach to the sun (see figure 3) is determined by $\frac{dr}{dt}\big|_{r=r_0} = 0$. Thus, 
from (16), we find $b^2 = r_0^2/\left(1 - \frac{r_S}{r_0}\right) \simeq r_0^2\left(1 + \frac{r_S}{r_0}\right)$. In the context of this problem, the notion of impact parameter 
is not relevant, and thus we trade $b$ for $r_0$. Evidently, the effect is maximized if the beam gets as close to the sun 
as possible, which occurs when Venus and the earth are at opposite sides of the sun. Thus, in this problem we 
have $r_E, r_V \gg r_0 \gg r_S$, and so we expand $\frac{dt}{dr}$ to first order in $r_S$: $\frac{dt}{dr} \simeq \left(1 - \frac{r_0^2}{r^2}\right)^{-1/2}\left(1 + \frac{r_S}{r} + \frac{r_S r_0}{2r(r + r_0)}\right)$. 

The time $t(r_1, r_2)$ for the radar beam to get from $r_1$ to $r_2$ is given by $t(r_1, r_2) = \int_{r_1}^{r_2} dr\, \frac{dt}{dr}$ (by convention, for 
$r_2 > r_1$, since in the expression for $\frac{dt}{dr}$ given above we have taken the positive root in (16)). The time for a round 
trip from the earth to Venus is thus given by $T(r_E, r_V, r_0) = 2(t(r_0, r_V) + t(r_0, r_E))$. As a very small check, let 
us compute the time with $r_S$ set to 0: $t_0(r_1, r_2) = \int_{r_1}^{r_2} dr \left(1 - \frac{r_0^2}{r^2}\right)^{-1/2} = \sqrt{r_2^2 - r_0^2} - \sqrt{r_1^2 - r_0^2}$ (for $r_1, r_2 > r_0$), in 
agreement with expectation. Talk about misleading terminology to call this the Newtonian time against which we 
define the “delay”; we might call it more accurately the Minkowskian time, or perhaps even the Pythagorean time. 
In any case, the time delay is given by $\Delta T(r_E, r_V, r_0) = 2(\Delta t(r_0, r_V) + \Delta t(r_0, r_E))$ with 

$$\Delta t(r_0, r_1) = r_S\int_{r_0}^{r_1} dr \left(1 - \frac{r_0^2}{r^2}\right)^{-1/2}\left(\frac{1}{r} + \frac{r_0}{2r(r + r_0)}\right) = r_S\left[\log\frac{r_1 + \sqrt{r_1^2 - r_0^2}}{r_0} + \frac{1}{2}\sqrt{\frac{r_1 - r_0}{r_1 + r_0}}\right] \tag{17}$$


* Recently a friend of mine remarked to me over dinner that anybody who has read a book on general relativity 
knows how to calculate the 43” per century, but how many physicists can calculate the 5,600” per century? Ouch, 
point well taken! 

Figure 3 Radar echo delay. V stands for Venus, E for the earth. 


(While we could evaluate the integral¹⁷ exactly, to extract the leading logarithmic term $\Delta t(r_0, r_1) \simeq r_S(\log(r_1/r_0) + \cdots)$ 
for $r_1 \gg r_0$, we can simply set $r_0$ to 0 in the integral.) I won’t bother to put together a final expression for 
$\Delta T(r_E, r_V, r_0)$ for you, but I do wish to remark that, as usual, it takes herculean effort to realize the actual 
experiment with such a tiny effect. We have not even mentioned various necessary corrections, such as the 
propagation of the radar beam through the solar corona. Shapiro was able to verify Einstein’s theory to a couple 
of percentage points, but over the decades, the accuracy has now been improved to about a tenth of a percent, 
using satellites carrying frequency dependent transponders rather than using Venus. 
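
To get a feeling for how tiny the effect is, here is a minimal Python sketch that evaluates the closed form in (17) for a beam grazing the sun, and cross-checks it against a brute-force evaluation of the integral. The substitution $r = r_0\cosh x$ used in the numerical check is this sketch's own device (it turns the integrand into $1 + \frac{1}{2(1 + \cosh x)}$ and removes the endpoint singularity); the orbital radii and constants are rounded reference values assumed for illustration, and the orbits are taken to be circles.

```python
import math

# Rounded textbook constants in SI units (assumed values, not from the text).
G = 6.674e-11            # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8              # speed of light, m/s
M_SUN = 1.989e30         # mass of the sun, kg
R_SUN = 6.957e8          # radius of the sun, m
AU = 1.496e11            # astronomical unit, m
R_EARTH_ORBIT = 1.000 * AU
R_VENUS_ORBIT = 0.723 * AU


def leg_delay_closed_form(r0, r1, r_s):
    """Delta t(r0, r1) of (17), in units of length (divide by c for seconds)."""
    root = math.sqrt(r1**2 - r0**2)
    return r_s * (math.log((r1 + root) / r0) + 0.5 * math.sqrt((r1 - r0) / (r1 + r0)))


def leg_delay_numeric(r0, r1, r_s, steps=20000):
    """Midpoint-rule evaluation of the integral in (17) after r = r0 cosh(x)."""
    x_max = math.acosh(r1 / r0)
    h = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += 1.0 + 0.5 / (1.0 + math.cosh(x))
    return r_s * total * h


def round_trip_delay_seconds(r_earth, r_venus, r0, mass_kg=M_SUN):
    """Delta T = 2 (Delta t(r0, r_V) + Delta t(r0, r_E)), converted to seconds."""
    r_s = 2.0 * G * mass_kg / C**2
    length = 2.0 * (leg_delay_closed_form(r0, r_venus, r_s)
                    + leg_delay_closed_form(r0, r_earth, r_s))
    return length / C


if __name__ == "__main__":
    delay = round_trip_delay_seconds(R_EARTH_ORBIT, R_VENUS_ORBIT, R_SUN)
    print(f"round-trip delay for a grazing beam: {delay * 1e6:.0f} microseconds")
```

With these inputs the delay comes out at a few hundred microseconds, on a round trip that takes many minutes.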

By the way, just about nothing sets off a stampede of crackpots saying that Einstein was wrong faster than a 
newspaper report about the latest radar echo delay measurement. The unfortunate word “delay” suggests to 
the uninformed that light does not actually move at the speed of light. You of course know better: (V.4.21) for 
$dr/d\zeta$ comes directly from $g_{\mu\nu}dx^\mu dx^\nu = 0$. 


Appendix 3: Time dependent spherically symmetric mass distribution and 
the Jebsen-Birkhoff theorem 


Jorg Tofte Jebsen in 1921 and George Birkhoff in 1923 showed that, remarkably, the Schwarzschild solution 
continues to hold outside a time dependent spherically symmetric mass distribution. This result, evidently 
of great relevance in studying the gravitational collapse of a spherically symmetric dust cloud to form 
a black hole, is known¹⁸ as Birkhoff’s theorem in most textbooks. I will walk you through a proof by direct 
computation. 

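In exercise V.4.3, you showed that time dependence leads to three more nonvanishing Christoffel symbols, 
namely $\Gamma^t_{tt} = \frac{\dot A}{2A}$, $\Gamma^t_{rr} = \frac{\dot B}{2A}$, and $\Gamma^r_{tr} = \frac{\dot B}{2B}$, with odd numbers of the $t$ index. This introduces one more nonvanishing 
component in the Ricci tensor: 

$$R_{tr} = \frac{\dot B}{rB} \tag{18}$$

The three components we already had in (5-7) acquire additional terms as follows: 

$$R_{tt} = \frac{A''}{2B} + \frac{A'}{rB} - \frac{A'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) - \frac{\ddot B}{2B} + \frac{\dot B}{4B}\left(\frac{\dot A}{A} + \frac{\dot B}{B}\right) \tag{19}$$

$$R_{rr} = -\frac{A''}{2A} + \frac{B'}{rB} + \frac{A'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) + \frac{\ddot B}{2A} - \frac{\dot B}{4A}\left(\frac{\dot A}{A} + \frac{\dot B}{B}\right) \tag{20}$$

$$R_{\theta\theta} = 1 - \frac{1}{B} - \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right) \tag{21}$$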

and of course we still have $R_{\varphi\varphi} = \sin^2\theta\, R_{\theta\theta}$. 

This looks like a scary mess,* but fortunately here we only want to solve Einstein’s equation in empty spacetime, 
outside the mass distribution. The equation $R_{tr} = 0$ tells us immediately that $\dot B = 0$. Happily for us, we see that, 
in $R_{tt}$ and $R_{rr}$, $\dot A$ appears multiplied by $\dot B$ and so drops out. 

Alternatively, the equation $R_{\theta\theta} = 0$ tells us that, since $B$ does not depend on $t$, $(\log A)' = \frac{A'}{A}$ depends only on 
$r$. Thus, $\log A$ is the sum of a function of $r$ and a function of $t$, and hence $A$ is the product of a function of $r$ and 
a function of $t$, namely $A = f(t)\left(1 - \frac{2GM}{r}\right)$, with $f(t)$ some unknown function. But we can then simply define 
$d\bar t = \sqrt{f(t)}\,dt$ to get rid of $f(t)$. 

Thus, the Schwarzschild solution holds outside a spherically symmetric mass distribution, even if it varies 
in time. We will come back to the Jebsen-Birkhoff theorem in chapter IX.4. For now, note that the theorem is 
the general relativistic analog of Newton’s two superb theorems mentioned way back in chapter I.1. To solve 
Einstein’s field equation for the empty spacetime inside a spherical shell, simply go through all the same steps 
as in solving the Einstein field equation outside a spherically symmetric mass distribution, except that when 
you obtain $\frac{1}{B} = 1 + \frac{b}{r}$ you have to set the integration constant $b$ to 0, since the spacetime must not be singular 
at $r = 0$. 


Appendix 4: Weyl’s shortcut to Schwarzschild 


It is amusing to mention a quick, but not totally kosher, way¹⁹ to the Schwarzschild solution given by Weyl, which 
Einstein professed in his writings to like.²⁰ 
Weyl simply plugs the Ansatz for the metric (1) into the Einstein-Hilbert action $S_{\text{EH-Weyl}} = \int d^4x\sqrt{-g}\,R$ to 
obtain an “effective” action $S_{\text{effective}}(A, B)$. Any student who understands the variational principle could tell him 
this is not† quite legitimate. The correct procedure is of course to plug the Ansatz into the equations of motion 
obtained by varying the action. The rigorous mathematical justification²¹ of what Weyl did took almost a century. 
The determinant of the metric $-g = ABr^4\sin^2\theta$ is easy. The scalar curvature 


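$$R = g^{\mu\nu}R_{\mu\nu} = -\frac{R_{tt}}{A} + \frac{R_{rr}}{B} + \frac{1}{r^2}\left(R_{\theta\theta} + \frac{R_{\varphi\varphi}}{\sin^2\theta}\right) = -\frac{R_{tt}}{A} + \frac{R_{rr}}{B} + \frac{2}{r^2}R_{\theta\theta} \tag{22}$$

is evaluated using (5-7). Weyl found that the substitution $A = a^2b$ and $B(r) = 1/b(r)$ simplifies the resulting 
mess considerably.‡ After integrating by parts, Weyl found that the action $S_{\text{effective}}(A, B)$ becomes 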


$$S_{\text{EH-Weyl}} = 8\pi\int dt\int dr\, ra\left(\frac{1 - b}{r} - b'\right) = 8\pi\int dt\left\{-\int dr\, r(1 - b)a'\right\} \tag{23}$$

where in the last step, we integrated $\int dr\, rab' = -\int dr\,(ra' + a)b$ by parts. In other words, Weyl dropped surface 
terms left and right. In fact, we can drop the integration over $t$ just as we had integrated over $\theta$ and $\varphi$. Weyl was 
left with the amazingly simple effective action 

$$S_{\text{effective}}(a, b) = -\int_0^\infty dr\, r(1 - b)a' \tag{24}$$


Varying $S_{\text{effective}}$ with respect to $b$ gives $a' = 0$, and with respect to $a$ gives $(r(1 - b))' = 0$. Fitting to the boundary 
conditions at spatial infinity gives $a = 1$ and $b = 1 - \frac{2M}{r}$; in other words, the Schwarzschild solution. (Recall an 
exercise you did back in chapter II.1.) 

Actually, in my humble opinion, even by committing an illegitimacy, Weyl did not save all that much in 
arithmetic. In chapter VI.4, I will show that, if we are allowed to integrate by parts and throw away boundary 
terms with no questions asked, then we can write the Einstein-Hilbert action as $S_{\text{EH}} = \int d^4x\sqrt{-g}\,g^{\mu\nu}[\Gamma^\rho_{\mu\sigma}\Gamma^\sigma_{\nu\rho} - \Gamma^\rho_{\mu\nu}\Gamma^\sigma_{\rho\sigma}]$ 
after considerable formal manipulations. If we are allowed to start with this action and use Weyl’s 
trick, then we can avoid calculating any of the curvature tensors and do save some arithmetical drudgery (which 
in any case we could foist on a computer, not to mention a competent student). 


* As a check, note that $R_{tt}$ and $R_{rr}$ transform correctly under the scaling $t \to \lambda t$, $A \to \lambda^{-2}A$, and $B \to B$. 

† To find the minimum of $f(x_1, \cdots, x_n)$ we should of course solve $\frac{\partial f}{\partial x_i} = 0$, $i = 1, \cdots, n$ with the appropriate 
Ansatz, instead of plugging some Ansatz for $x_1, \cdots, x_n$ into $f$ first and then differentiating. But if both 
the Ansatz and the actual solution possess the same high degree of symmetry, it might perhaps be okay. 

‡ You might have noticed that this substitution is designed, with hindsight, to “deal with” the two combinations 
$\left(\frac{A'}{A} + \frac{B'}{B}\right)$ and $\left(\frac{A'}{A} - \frac{B'}{B}\right)$ that appear in the Ricci tensor. 


Appendix 5: Droste’s solution of Einstein’s field equation 


Amazing though Schwarzschild’s story is, what happened to the obscure Dutchman Johannes Droste (1886-1963) 
is perhaps no less remarkable.²³ Droste, who received his doctorate in 1916 with Lorentz in Leiden, solved 
Einstein’s 1915 field equations around a spherically symmetric mass, starting with the preliminary version of 
the field equation published by Einstein in 1913. His work²⁴ was communicated by Lorentz to the Royal Dutch 
Academy of Sciences on May 27, 1916, a few months after Einstein had communicated Schwarzschild’s solution 
to the Prussian Academy of Sciences on January 13, 1916. In my opinion, Droste’s paper is cleaner and less 
confused than Schwarzschild’s, and furthermore contains an analysis of the motion of a particle in the spacetime. 
Interestingly, Droste also used Weyl’s approach, which we explained in appendix 4 and which Einstein liked so 
much, long before Weyl. For some reason, the physics community totally ignored this work. Droste became a 
high school teacher, and later a professor of mathematics at Leiden University. (While writing this appendix, 
I asked a professor of physics at Leiden University, who said he had never heard of Johannes Droste. He did 
tell me, however, that “Droste” was well known in the Netherlands as a brand of cocoa powder, after which the 
Droste effect [an apparently infinite regression of pictures within pictures²⁵] was named. He told me of his fond 
childhood memory²⁶ of drinking hot chocolate while being fascinated with infinity.) 

I was quite astonished by this story. Somehow, Lorentz never mentioned his student’s work to Einstein. Or 
perhaps he did, but Einstein chose to promote Schwarzschild, who after all died rather tragically. But why didn’t 
Droste protest every time the Schwarzschild solution was mentioned? Perhaps we simply live in a noisier and 
more assertive era. Here is an interesting tale for a budding historian of physics to look into. 

In the appendices to the chapter, I tell you about, not one, but two young guys getting shafted by the 
establishment. 


Exercises 


1 Calculate the Ricci tensor in terms of A and B. 
2 Calculate the Riemann tensor in terms of A and B. 


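3 Show that the Schwarzschild metric can be written in the isotropic form 

$$ds^2 = -\left(\frac{1 - \frac{GM}{2\rho}}{1 + \frac{GM}{2\rho}}\right)^2 dt^2 + \left(1 + \frac{GM}{2\rho}\right)^4\left(d\rho^2 + \rho^2(d\theta^2 + \sin^2\theta\, d\varphi^2)\right) \tag{25}$$

Where is the horizon? 


4 Show that the Schwarzschild metric can be written in the harmonic form 

$$ds^2 = -\left(\frac{1 - \frac{GM}{R}}{1 + \frac{GM}{R}}\right)dt^2 + \left(1 + \frac{GM}{R}\right)^2 d\vec x^{\,2} + \left(\frac{1 + \frac{GM}{R}}{1 - \frac{GM}{R}}\right)\frac{G^2M^2}{R^4}(\vec x\cdot d\vec x)^2 \tag{26}$$

with $R^2 = \vec x^{\,2}$. 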


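5 Show that in the parametrized post-Newtonian approximation described in chapter V.4, the deflection of 
light is given by $\Delta\varphi = \left(\frac{1 + \gamma}{2}\right)(\Delta\varphi)_{\text{Einstein}}$, and the perihelion shift by $\Delta\varphi = \left(\frac{2 - \beta + 2\gamma}{3}\right)(\Delta\varphi)_{\text{Einstein}}$. 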


6 Show that 
2M dr? 
2 2\ 7,2 2702 
ds =(1 : P) ar (Hs + asi) 
i 
satisfies the Einstein field equation R,,, = —3Ag,,, with a cosmological constant. This is known as the 


Schwarzschild—de Sitter spacetime. We will come back to this in chapter IX.10. 


7 Show that in (4 + 1)-dimensional spacetime, the analog of the Schwarzschild solution is given by 

$$ds^2 = -\left(1 - \frac{r_S^2}{r^2}\right)dt^2 + \left(1 - \frac{r_S^2}{r^2}\right)^{-1}dr^2 + r^2 d\Omega_3^2$$

where $d\Omega_3^2$ is the metric on the 3-sphere $S^3$. 


Notes 


1. It is less well known that while on the front, he also wrote a paper on the Stark effect. Perhaps it is only a 
slight exaggeration to say that these days there are professors of general relativity walking around proudly 
ignorant of atomic physics (and professors of atomic physics proudly ignorant of general relativity). 

2. Some would say that this is a highly appropriate name for someone who discovered black holes. 

3. Historically, the horizon was a source of great confusion, and Kruskal’s contribution cannot be overestimated. 
For example, on p. 203 of Bergmann’s textbook Introduction to the Theory of Relativity (with a foreword by 
A. Einstein) (1976 Dover edition, originally published in 1942), he quoted Robertson as concluding that “at 
least part of the singular character” of the metric at r = 2GM must be attributed to the choice of coordinates. 
Curiously, people at the time did not follow the modern expedient of simply noting the smoothness of the 
Riemann curvature tensor, which Schwarzschild himself, at the very least, must have calculated. Bergmann 
then went on and cited a paper by Einstein (Ann. Math. 40 (1939), p. 922) purportedly showing that in a toy 
model of a spherical cluster of noninteracting particles, the Schwarzschild singularity could not form. The 
general feeling was that the Schwarzschild singularity could not occur in nature. 

4. The second and third sentence in Kruskal’s paper read: “Kasner, Lemaitre, Einstein and Rosen, Robertson, 
Synge, Ehlers, Finkelstein, and Fronsdal have shown that the singularities at r = 0 and r = 2GM are very 
different in character. Their conclusion—that there is no real singularity at r = 2GM—can be demonstrated 
by a choice of coordinates seemingly simpler and more explicit than any introduced so far to this end.” The 
papers cited range from Kasner’s in 1921 to Fronsdal’s in 1959. I am assuredly not a historian, but this 
certainly indicates that after 44 years, the issue of the “spherical singularity” was about to be settled in 1960. 

5. Notice also that you calculated only the Ricci tensor (which, being zero, manifestly does not blow up at $r_S$) 
but not the full Riemann tensor, which you can now do as a tedious exercise. 

6. This is, of course, Minkowski’s “mystical” substitution x = it. 

7. A photograph of this letter (in German) is in the Huntington Digital Library (http://hdl-huntington.org). 

8. Most physics students associate Friedrich Bessel with cylindrical coordinates, but in fact his work with Bessel 
functions (actually first discovered by Daniel Bernoulli) was largely in connection with perturbations of 
planetary orbits. 

9. In 2012, astronomers discovered a star that has an orbital period of only 11.5 years around the Milky Way’s 
central black hole. This will, in due time, give another test of Einstein’s prediction of the perihelion shift. 

10. Not to be confused with his cousin the aircraft maker. 

11. Einstein to Sommerfeld, December 9, 1915: “Wie kommt uns da die pedantische Genauigkeit der Astronomie 
zu Hilfe, über die ich mich im Stillen früher oft lustig machte!” The Collected Papers of Albert Einstein, 
vol. 8, The Berlin Years: Correspondence, 1914-1918, ed. Robert Schulmann et al., Princeton University Press, 
1998, p. 217; English translation by Ann Hentschel, cited from the companion translation volume, p. 159. 

12. Quoted in A. Calaprice, ed. The Expanded Quotable Einstein, Princeton University Press, 2000. 

13. J. Waller, Einstein’s Luck. 

14. If you’ve heard the phrase “the guns of August,” which inspired a book with that title, you would know that 
the timing was optimal for Einstein, as it would turn out. 

15. For a contemporary study exonerating Eddington, see D. Kennefick, “Testing Relativity from the 1919 
Eclipse—A Question of Bias,” Physics Today, March 2009, p. 37. 

16. Note that by differentiating (13) for the perihelion motion, we obtain basically the same differential equation 
as in (10) for the deflection of light, perhaps not surprisingly. 

17. Note Newton’s greatness. In calculating the deviation from Newtonian physics, the mathematics we use is 
all Newtonian. 

18. Perhaps this is because Birkhoff was a famous professor at Harvard, while Jebsen (1888-1922) died young 
(of tuberculosis) and obscure. This is another example of the Matthew principle cited in chapter III.2. For 
the discovery of Jebsen’s paper, see S. Deser, Gen. Relativ. Gravit. 37 (2005), p. 2251, and N. V. Johansen and 
F. Ravndal, arXiv 0508163. 

19. H. Weyl, Space-Time-Matter, Dover, 1952; S. Deser and B. Tekin, Class. Quantum Grav. 20 (2003), pp. 4877-4883; 
S. Deser and J. Franklin, Am. J. Phys. March 2005, pp. 261-264. 

20. “The derivation given by Weyl in his book “Raum-Zeit-Materie” is particularly elegant.” A. Einstein, The 
Meaning of Relativity, Princeton University Press, 2004, p. 94. 

21. R. S. Palais, Comm. Math. Physics, 69 (1979), p. 19. 

22. Perhaps ironically one of the leading mathematicians of his time. Those who say that my textbooks are not 
rigorous enough for them, take note! Winging it often turns out to lead to the right answer. 

23. I am grateful to Gary Gibbons for telling me about Droste during a visit to Trinity College. 

24. J. Droste, reprinted in Gen. Rel. Grav. 34 (2002), p. 1545. See the historical notes by T. Rothman and by 
C. Beenakker, pp. 1541 and 1543. 

25. If you search the web for the Droste effect, you will see why it is named after cocoa powder. I was tempted 
to make this an endnote inside an endnote. 

26. Beats Proust any day. Recall also chapter III.4. 


Energy Momentum Distribution Tells Spacetime 
How to Curve 


The action of the world 


In chapter VI.1, we arrived at the action of the world (whoa!) $S = S_{\text{EH}} + S_{\text{matter}}$. Here $S_{\text{EH}}$ 
denotes the Einstein-Hilbert action and $S_{\text{matter}}$ a sum of various matter actions, such as 
the action for point particles and the Maxwell action for the electromagnetic field. (As 
already mentioned in chapter VI.1, in Einstein gravity, the term “matter” is often used in 
an extended sense to include everything else besides gravity, such as the electromagnetic 
field, which we normally do not think of as matter.) 

We have to vary $S = S_{\text{EH}} + S_{\text{matter}}$ with respect to the gravitational field $g_{\mu\nu}$ to obtain the 
full field equation for Einstein gravity. Thus far, we have avoided* varying $S_{\text{matter}}$. In this 
chapter, we will learn how to vary several different forms of $S_{\text{matter}}$. 

First, how do we obtain $S_{\text{matter}}$, as for example the Maxwell action for the electromagnetic 
field in curved spacetime? As explained in chapters V.2 and V.6, when we discussed the 
power of the equivalence principle, we simply† take the flat spacetime actions we have 
known and loved, such as the Maxwell action, and promote them to curved spacetime by 
replacing the Minkowski metric $\eta_{\mu\nu}$ by $g_{\mu\nu}$. 


The energy momentum tensor once again 


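Second, what do we get when we vary $S_{\text{matter}}$? Write the variation of $S_{\text{matter}}$ as $\delta S_{\text{matter}} = \frac{1}{2}\int d^4x \sqrt{-g}\, T^{\mu\nu}(x)\, \delta g_{\mu\nu}(x)$. In other words, define

$$T^{\mu\nu}(x) = \frac{2}{\sqrt{-g}}\, \frac{\delta S_{\text{matter}}}{\delta g_{\mu\nu}(x)} \tag{1}$$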


* In chapter VI.2, we guessed what varying the exceptionally simple $S_{\text{cosmological}} = -\int d^4x \sqrt{-g}\,\Lambda$ would give
us, rather than actually varying it.
† See appendix 3 in chapter IX.7 for an exception to this statement.
so that the field equation now has the form (recall the notation of chapter VI.1) $A(R^{\mu\nu} + a g^{\mu\nu} R) = T^{\mu\nu}$.

By this point, you understand that $T^{\mu\nu}(x)$ defines a symmetric 2-indexed tensor at every
point in spacetime. Furthermore, it appears in the field equation for Einstein gravity and 
determines the curvature of spacetime. So what could it be? 

Remember the energy momentum tensor $T^{\mu\nu}$ we derived in chapter III.6? Indeed, you
might accuse me of trying to sneak one past you by using the same notation. Yes, as
you might have guessed, the $T^{\mu\nu}$ here is the curved spacetime generalization of the $T^{\mu\nu}$
there. Physically, it makes sense that the distribution of energy momentum determines
the curvature. The energy momentum tensor $T^{\mu\nu}$ appears in Einstein’s equation as
a source for the gravitational field, just as the electromagnetic current $J^\mu$ appears in
Maxwell’s equation as a source for the electromagnetic field. In fact, you can see that what
we are doing is analogous to what we did in chapter IV.2; one difference is that we have to 
carry one more index around. 

We have arrested the suspect, but how do we convict him? Already, we have mentioned
a load of circumstantial evidence.* We will now identify $T^{\mu\nu}$ for a number of cases, show
that the integral $\int_V d^3x \sqrt{-g}\, T^{0\mu}$ gives the energy and the momentum contained in the
volume $V$, and verify that $T^{\mu\nu}$ reduces correctly to the energy momentum tensor we knew
and loved in flat spacetime back in chapter III.6. Most importantly, we will show the court
that the $T^{\mu\nu}$ defined in (1) is covariantly conserved

$$D_\mu T^{\mu\nu} = 0 \tag{2}$$

which reduces correctly to the familiar $\partial_\mu T^{\mu\nu} = 0$ in the flat spacetime limit. If it waddles
and quacks like a duck, then it is a duck.

We did not go looking for the energy momentum tensor, but the energy momentum 
tensor came looking for us! 


Energy and momentum of point particles 


Let’s see how (1) works for the simple case of a gas of point particles that do not interact 
with one another, known as dust in general relativity and cosmology. The action reads 


$$S_{\text{particles}} = -\sum_a m_a \int d\tau_a \sqrt{-g_{\mu\nu}(X_a)\,\frac{dX_a^\mu}{d\tau_a}\frac{dX_a^\nu}{d\tau_a}} \tag{3}$$


You have encountered this action in flat spacetime several times already, in chapters III.5 
and IV.2. The equivalence principle again roars with its awesome power: simply replace 


* Another clue comes from the equation for the Newtonian gravitational potential: $\nabla^2\phi = 4\pi G\rho$. In a
relativistic theory, the mass density $\rho$ is replaced by the energy density, and we can’t speak of energy density
without talking about momentum density as well. This suggests that the right hand side of Einstein’s field 
equation should involve energy and momentum density. See chapters III.6 and IX.5. 
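$\eta_{\mu\nu}$ by $g_{\mu\nu}$ and behold the action in its full glory in curved spacetime. Varying $S_{\text{particles}}$ with
respect to $g_{\mu\nu}(x)$ and using (1), we obtain

$$\begin{aligned}
T^{\mu\nu}(x) &= \frac{2}{\sqrt{-g(x)}}\, \frac{\delta S_{\text{particles}}}{\delta g_{\mu\nu}(x)} \\
&= \frac{1}{\sqrt{-g(x)}}\, 2 \sum_a \frac{1}{2}\, m_a \int d\tau_a\, \frac{1}{\sqrt{-g_{\rho\sigma}(X_a)\,\frac{dX_a^\rho}{d\tau_a}\frac{dX_a^\sigma}{d\tau_a}}}\, \frac{dX_a^\mu}{d\tau_a}\frac{dX_a^\nu}{d\tau_a}\, \delta^4(x - X_a(\tau_a)) \\
&= \frac{1}{\sqrt{-g(x)}} \sum_a m_a \int d\tau_a\, \frac{dX_a^\mu}{d\tau_a}\frac{dX_a^\nu}{d\tau_a}\, \delta^4(x - X_a(\tau_a))
\end{aligned} \tag{4}$$

where as usual the third equality follows from the sensible parametrization of the particle
worldlines, namely defining $\tau_a$ by setting $g_{\mu\nu}(X_a)\,\frac{dX_a^\mu}{d\tau_a}\frac{dX_a^\nu}{d\tau_a} = -1$. As we suspected, $T^{\mu\nu}(x)$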


is precisely the curved spacetime generalization of the energy momentum tensor we first 
encountered in Minkowskian spacetime. Setting $g_{\mu\nu}$ to $\eta_{\mu\nu}$, we recover (III.6.7): the “only”
difference is the appearance of the density factor $1/\sqrt{-g}$, which is precisely what is needed
to counteract the $\sqrt{-g}$ in the volume factor when we integrate over $T^{0\mu}$ to obtain the energy
and momentum of the particles. Indeed, this provides a nice formal check of our more 
laborious, but more physical, derivation in chapter III.6. It also shows that the plus sign 
in (1) comes from the minus sign in (3), which was needed (as was first explained back in 
chapter III.5) to reproduce the Newtonian action for a point particle. 
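
For readers who want the intermediate step behind (4) written out, here is a sketch of the functional derivative. The only ingredient not spelled out in the text is the rewriting of $\delta g_{\mu\nu}(X_a)$ as an integral against a delta function; the overdot is a shorthand introduced for this sketch and stands for $d/d\tau_a$.

```latex
\begin{aligned}
\delta S_{\text{particles}}
 &= -\sum_a m_a \int d\tau_a\,
    \delta\sqrt{-g_{\rho\sigma}(X_a)\,\dot X_a^\rho \dot X_a^\sigma}
  = \sum_a m_a \int d\tau_a\,
    \frac{\dot X_a^\mu \dot X_a^\nu}
         {2\sqrt{-g_{\rho\sigma}(X_a)\,\dot X_a^\rho \dot X_a^\sigma}}\;
    \delta g_{\mu\nu}(X_a), \\
\delta g_{\mu\nu}(X_a)
 &= \int d^4x\; \delta^4(x - X_a(\tau_a))\; \delta g_{\mu\nu}(x).
\end{aligned}
```

Comparing with $\delta S_{\text{matter}} = \frac{1}{2}\int d^4x \sqrt{-g}\, T^{\mu\nu}\,\delta g_{\mu\nu}$ then reproduces the second line of (4). The overall plus sign arises because the minus sign in (3) cancels the minus sign under the square root.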


A common sign error 


I must emphasize, once again, that the gravitational field is $g_{\mu\nu}$, not $g^{\mu\nu}$. The reason is
that from the very beginning, we defined coordinates to carry upper indices, and so for
point particles the dynamical variables $X^\mu$ carry an upper Lorentz index. Thus, particles
couple to $g_{\mu\nu}$, not $g^{\mu\nu}$. In varying the action here, we are required to hold the dynamical
variables $X^\mu$ fixed as we did in (4). Of course, once we obtain $T^{\mu\nu}$, we can lower indices at
will using the metric and define $T_{\mu\nu} = g_{\mu\rho}\, T^{\rho\sigma} g_{\sigma\nu}$.

People sometimes commit a sign error here. For an invertible matrix $M$, recall from
(V.6.7) that $\delta M^{-1} = -M^{-1}\,\delta M\, M^{-1}$. Applying this to the metric, we have

$$\delta g^{\mu\nu} = -g^{\mu\rho}\,\delta g_{\rho\sigma}\, g^{\sigma\nu} \tag{5}$$

(Note the sign, which we can verify in the 1-by-1 case: $\delta x^{-1} = -x^{-2}\,\delta x$.)
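
The sign in (5) can also be checked numerically. The following is a minimal sketch in plain Python; the 2-by-2 matrix standing in for $g_{\mu\nu}$ and the small perturbation standing in for $\delta g_{\mu\nu}$ are arbitrary illustrative choices, not anything taken from the text.

```python
# Numerical check of delta(M^-1) = -M^-1 (delta M) M^-1 to first order.
# M and dM are arbitrary stand-ins for g_{mu nu} and delta g_{mu nu}.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(x, y):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

M = [[-1.0, 0.3], [0.3, 2.0]]
eps = 1e-6
dM = [[0.7 * eps, -0.2 * eps], [-0.2 * eps, 0.5 * eps]]

M_new = [[M[i][j] + dM[i][j] for j in range(2)] for i in range(2)]
M_inv = inv2(M)
M_new_inv = inv2(M_new)

# Exact change of the inverse versus the first-order formula (5)
exact = [[M_new_inv[i][j] - M_inv[i][j] for j in range(2)] for i in range(2)]
first_order = [[-v for v in row] for row in matmul(matmul(M_inv, dM), M_inv)]
wrong_sign = [[-v for v in row] for row in first_order]

err_right = max_abs_diff(exact, first_order)   # second order in eps
err_wrong = max_abs_diff(exact, wrong_sign)    # first order in eps
print(err_right, err_wrong)
```

With the correct sign the mismatch is of second order in the perturbation, while flipping the sign leaves a mismatch of first order.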

Were we to define a $T_{\mu\nu}$ by varying with respect to $g^{\mu\nu}$ in (1), we would have produced
an erroneous sign. I will show you this schematically, omitting inessential factors. Given a
matter action $S(X^{\mu\nu} g_{\mu\nu}) = S(X_{\rho\sigma} g^{\rho\sigma})$, we have $\delta S = S' X^{\mu\nu}\,\delta g_{\mu\nu}$, but $\delta S = S' X_{\rho\sigma}\,\delta g^{\rho\sigma} =
S' X_{\rho\sigma}(-g^{\rho\mu}\,\delta g_{\mu\nu}\, g^{\nu\sigma}) = -S' X^{\mu\nu}\,\delta g_{\mu\nu}$. This slightly subtle point is sometimes not made
sufficiently clear. The point is of course that in varying, you have to specify* what is being
held fixed.


* As in thermodynamics. 
Energy momentum in the electromagnetic field


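Let me now show you the power of this definition of $T^{\mu\nu}$ by obtaining results that
may be familiar to you about the electromagnetic field in flat spacetime. Promote the
Maxwell action $S_{\text{Maxwell}} = \int d^4x\, \mathcal{L}_{\text{Maxwell}} = -\frac{1}{4}\int d^4x\, F_{\mu\nu}F^{\mu\nu} = -\frac{1}{4}\int d^4x\, \eta^{\mu\rho}\eta^{\nu\sigma} F_{\mu\nu}F_{\rho\sigma}$
for the electromagnetic field in Minkowskian spacetime introduced in chapter IV.2 to

$$S_{\text{Maxwell}} = -\frac{1}{4}\int d^4x \sqrt{-g}\, g^{\mu\rho} g^{\nu\sigma} F_{\mu\nu} F_{\rho\sigma} \tag{6}$$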


in curved spacetime. Next, vary as in (1) and thus calculate $T^{\mu\nu}$ of the electromagnetic
field. 

The important point here is that $A_\mu$, not $A^\mu = g^{\mu\nu}A_\nu$, is the dynamical variable and
is to be held fixed. One way to see this is to recall that in our discussion (chapter IV.2)
of electromagnetic gauge invariance, $A_\mu$ goes with $\partial_\mu = \partial/\partial x^\mu$. So the mnemonic is that
$X^\mu$ and $A_\mu$ are the dynamical variables. Another important bit of information you should
recall is that, as explained in chapter V.6, the covariant curl is equal to the ordinary curl
$D_\mu A_\nu - D_\nu A_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu$, so that $F_{\mu\nu}$ does not depend on the metric.

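Here, we have to vary the determinant $g = \det g_{\mu\nu}$, but we did that already back in
chapter V.6. For convenience, I will repeat it here. Using the identity $\log\det M = \operatorname{tr}\log M$,
we obtained $\delta\det M = \det M\,\delta(\operatorname{tr}\log M) = \det M\,\operatorname{tr}(M^{-1}\delta M)$ and thus

$$\delta\sqrt{-g} = \frac{1}{2}\sqrt{-g}\, g^{\mu\nu}\,\delta g_{\mu\nu} \tag{7}$$

(To check, we again go to the 1-by-1 case: $\delta\sqrt{-x} = \frac{1}{2}\sqrt{-x}\, x^{-1}\,\delta x$.)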
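
Beyond the 1-by-1 case, (7) can be spot-checked numerically. The sketch below uses plain Python and an arbitrary 2-by-2 symmetric matrix with negative determinant as a stand-in for $g_{\mu\nu}$; the numbers are illustrative assumptions only.

```python
import math

# Numerical check of delta sqrt(-g) = (1/2) sqrt(-g) g^{mu nu} delta g_{mu nu}.
# g and dg are arbitrary stand-ins; g is chosen so that det g < 0.

def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = det2(m)
    return [[d / det, -b / det], [-c / det, a / det]]

g = [[-1.0, 0.3], [0.3, 2.0]]
eps = 1e-6
dg = [[0.7 * eps, -0.2 * eps], [-0.2 * eps, 0.5 * eps]]
g_new = [[g[i][j] + dg[i][j] for j in range(2)] for i in range(2)]

# Exact change of sqrt(-det g)
exact = math.sqrt(-det2(g_new)) - math.sqrt(-det2(g))

# First-order prediction from (7); the contraction is g^{mu nu} delta g_{mu nu}
g_inv = inv2(g)
contraction = sum(g_inv[i][j] * dg[i][j] for i in range(2) for j in range(2))
first_order = 0.5 * math.sqrt(-det2(g)) * contraction

print(exact, first_order)
```

The two printed numbers agree up to terms of second order in the perturbation.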
Now we're ready to vary: 


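$$\begin{aligned}
T^{\mu\nu}(x) &= \frac{2}{\sqrt{-g(x)}}\, \frac{\delta}{\delta g_{\mu\nu}(x)}\, S_{\text{Maxwell}} \\
&= -\frac{2}{4\sqrt{-g(x)}}\, \frac{\delta}{\delta g_{\mu\nu}(x)} \int d^4y \sqrt{-g(y)}\, g^{\sigma\lambda}(y)\, g^{\rho\tau}(y)\, F_{\sigma\rho}(y)\, F_{\lambda\tau}(y) \\
&= F^{\mu}{}_{\lambda}(x)\, F^{\nu\lambda}(x) - \frac{1}{4}\, g^{\mu\nu}(x)\, F_{\sigma\rho}(x)\, F^{\sigma\rho}(x)
\end{aligned} \tag{8}$$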


To obtain the last expression, we used (5) and (7). Note the local character of the energy 
momentum tensor. 
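
The step from the second to the third line of (8) can be sketched as follows, using only (5), (7), and the fact that $F_{\mu\nu}$ does not depend on the metric. The factor of 2 in the first line reflects the two inverse metrics, which contribute equally.

```latex
\begin{aligned}
\delta\!\left(\sqrt{-g}\; g^{\sigma\lambda} g^{\rho\tau} F_{\sigma\rho} F_{\lambda\tau}\right)
 &= \delta\sqrt{-g}\; F_{\sigma\rho} F^{\sigma\rho}
  + 2\sqrt{-g}\; \delta g^{\sigma\lambda}\, g^{\rho\tau} F_{\sigma\rho} F_{\lambda\tau} \\
 &= \sqrt{-g}\left(\tfrac{1}{2}\, g^{\mu\nu} F_{\sigma\rho} F^{\sigma\rho}
  - 2\, F^{\mu}{}_{\rho} F^{\nu\rho}\right)\delta g_{\mu\nu}.
\end{aligned}
```

Multiplying by the prefactor $-\frac{2}{4\sqrt{-g}}$ gives $F^{\mu}{}_{\rho}F^{\nu\rho} - \frac{1}{4} g^{\mu\nu} F_{\sigma\rho}F^{\sigma\rho}$, in agreement with the last line of (8).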

The energy momentum tensor (8) of the electromagnetic field contains two pieces. Take 
the trace and watch the results from the two pieces cancel each other: 


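$$T \equiv g_{\mu\nu} T^{\mu\nu} = g_{\mu\nu}\left(F^{\mu}{}_{\lambda} F^{\nu\lambda} - \frac{1}{4}\, g^{\mu\nu} F_{\sigma\rho} F^{\sigma\rho}\right) = 0 \tag{9}$$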


Note how various signs play a crucial role here. 
The energy momentum tensor of the electromagnetic field is traceless.* Interestingly, in 
chapter III.6, we derived the tracelessness of the energy momentum tensor of a photon gas 


* Looking ahead, we remark here that this is related to the invariance of the Maxwell action under scale 
transformation. We will discuss scale invariance in chapter IX.9. 
coming from a rather different direction. This is an example of the “particle-field duality”
that is at the core of quantum field theory. 


Electromagnetism in flat spacetime 


It is instructive to make contact with what you know about electromagnetism in flat
spacetime, either from chapter IV.1 or IV.2. Recall that the field strength $F_{\mu\nu}$ is related
to the electric and magnetic fields by $F_{0i} = E_i$ and $F_{ij} = \varepsilon_{ijk} B_k$.

Well, simply demote $g^{\mu\nu}$ in (6) and (8) to $\eta^{\mu\nu}$. The Maxwell Lagrangian (6) becomes

$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} = -\frac{1}{4}\left(-2\sum_i F_{0i}^2 + \sum_{i,j} F_{ij}^2\right) = \frac{1}{2}\left(\vec E^{\,2} - \vec B^{\,2}\right) \tag{10}$$


where for clarity, I have reinstated the summation sign. Next, work out the different components
of $T^{\mu\nu}$ and compare with what you know, either from a course on electromagnetism
or with exercise IV.2.2. The energy density is given by

$$T^{00} = T_{00} = F_{0i}F_{0i} - \mathcal{L} = \vec E^{\,2} - \frac{1}{2}\left(\vec E^{\,2} - \vec B^{\,2}\right) = \frac{1}{2}\left(\vec E^{\,2} + \vec B^{\,2}\right) \tag{11}$$

(Note that in Minkowski spacetime $T^{00}$ and $T_{00}$ are numerically the same.) It is comforting
to see an energy density we’ve known from “childhood”: the energy density is a
rotational scalar to which the electric and the magnetic fields contribute equally.

Contrast the signs in (10) and (11). The signs work just as in Newtonian mechanics, 
where the energy is the sum of the kinetic and potential energies, while the Lagrangian 
is the difference. In electromagnetism, the electric field plays the kinetic role, while the 
magnetic field plays the potential role. Indeed, the electromagnetic field may be regarded as
a collection of an infinite number of harmonic oscillators. This view provides one possible 
starting point for quantum field theory. 

Next, calculate the momentum density 


$$T_{0i} = F_{0\lambda} F_i{}^{\lambda} = F_{0j} F_{ij} = \varepsilon_{ijk} E_j B_k = (\vec E \times \vec B)_i \tag{12}$$


The Poynting vector you learned in electromagnetism has just emerged! It is the simplest 
rotational vector you can form out of the electric and the magnetic fields with the correct 
transformation properties under reflections in space and in time. 
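
Equations (9) through (12) are easy to spot-check numerically. The sketch below uses plain Python and assumes the conventions stated in the text, namely $\eta = \mathrm{diag}(-1, +1, +1, +1)$, $F_{0i} = E_i$, and $F_{ij} = \varepsilon_{ijk}B_k$. The field values and all helper names are arbitrary illustrative choices.

```python
# Flat-spacetime check of the Maxwell energy momentum tensor, equation (8)
# with g -> eta. Conventions assumed from the text:
# eta = diag(-1, +1, +1, +1), F_{0i} = E_i, F_{ij} = eps_{ijk} B_k.

E = [0.3, -1.2, 0.5]     # arbitrary sample electric field
B = [0.8, 0.1, -0.4]     # arbitrary sample magnetic field

R4 = range(4)
eta = [[0.0] * 4 for _ in R4]
eta[0][0] = -1.0
for i in range(1, 4):
    eta[i][i] = 1.0

def levi_civita(i, j, k):
    """Antisymmetric symbol eps_{ijk} for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

# Covariant field strength F_{mu nu}
F_lo = [[0.0] * 4 for _ in R4]
for i in range(3):
    F_lo[0][i + 1] = E[i]
    F_lo[i + 1][0] = -E[i]
    for j in range(3):
        F_lo[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k] for k in range(3))

def move_both_indices(t):
    """Raise or lower both indices with eta, which is its own inverse."""
    return [[sum(eta[m][a] * eta[n][b] * t[a][b] for a in R4 for b in R4)
             for n in R4] for m in R4]

F_up = move_both_indices(F_lo)
F_sq = sum(F_lo[a][b] * F_up[a][b] for a in R4 for b in R4)  # F_{sr} F^{sr}

# T^{mu nu} = F^{mu}_{lambda} F^{nu lambda} - (1/4) eta^{mu nu} F_{sr} F^{sr}
T_up = [[sum(F_up[m][a] * eta[a][b] * F_up[n][b] for a in R4 for b in R4)
         - 0.25 * eta[m][n] * F_sq
         for n in R4] for m in R4]
T_lo = move_both_indices(T_up)

E2 = sum(x * x for x in E)
B2 = sum(x * x for x in B)
lagrangian = -0.25 * F_sq
energy_density = T_up[0][0]
momentum_density = [T_lo[0][i + 1] for i in range(3)]
poynting = [E[1] * B[2] - E[2] * B[1],
            E[2] * B[0] - E[0] * B[2],
            E[0] * B[1] - E[1] * B[0]]
trace = sum(eta[m][n] * T_up[m][n] for m in R4 for n in R4)

print("L   :", lagrangian, "vs", 0.5 * (E2 - B2))
print("T00 :", energy_density, "vs", 0.5 * (E2 + B2))
print("T0i :", momentum_density, "vs", poynting)
print("T   :", trace)
```

Each printed pair should agree to rounding error, and the trace should vanish, for any choice of the sample fields.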


Interaction among different matter sectors 


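Thus far, we have treated $S_{\text{particles}}$ and $S_{\text{Maxwell}}$ in turn. But now consider the action
$S_{\text{particles}} + S_{\text{Maxwell}} + S_{\text{interaction}}$, where, as we first saw in chapter IV.2,

$$S_{\text{interaction}} = \sum_a e_a \int d\tau_a\, \frac{dX_a^\mu}{d\tau_a}\, A_\mu(X_a(\tau_a)) \tag{13}$$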
a a 


with e, the charge of particle a. All we have to do is to plug this action into (1) and turn 
the crank. 
“But I don’t see $g_{\mu\nu}$ anywhere in (13),” Confusio pipes up.
Indeed, for once Confusio is right: perhaps surprisingly, $S_{\text{interaction}}$ does not contribute to
$T^{\mu\nu}$ according to (1). Thus, for charged particles interacting with one another, the energy
momentum tensor that the gravitational field responds to is just the sum of the energy
momentum tensors in (4) and (8):


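$$T^{\mu\nu} = T^{\mu\nu}_{\text{particles}} + T^{\mu\nu}_{\text{Maxwell}}
= \frac{1}{\sqrt{-g}} \sum_a m_a \int d\tau_a\, \frac{dX_a^\mu}{d\tau_a}\frac{dX_a^\nu}{d\tau_a}\, \delta^4(x - X_a) + F^{\mu}{}_{\lambda} F^{\nu\lambda} - \frac{1}{4}\, g^{\mu\nu} F_{\sigma\rho} F^{\sigma\rho} \tag{14}$$

However, as we would expect physically, since the charged particles and the electromagnetic
field can exchange energy and momentum, the tensors $T^{\mu\nu}_{\text{particles}}$ and $T^{\mu\nu}_{\text{Maxwell}}$ are no
longer separately conserved. Only $T^{\mu\nu}$ is conserved.¹ We will reach a deeper understanding
of this point in appendix 1.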


What exactly are energy and momentum, anyway? 


You started studying physics by learning about mass, energy, momentum, all sorts of great
stuff like that. But what exactly are energy and momentum anyway? I want to emphasize here
that (1) gives us a fundamental definition of energy (and hence mass) and momentum.
Energy momentum is what the graviton listens to, just as electric charge is what the photon
listens to. (The graviton is of course the particle associated with the field $g_{\mu\nu}$.) In other
words, energy momentum as embodied in $T^{\mu\nu}$ is what appears in the right hand side of
Einstein’s field equation: it is the stuff that tells spacetime how to curve.

As is evident from the electromagnetic example, this definition is useful even if we are
not interested in curved spacetime per se. Given an action in flat spacetime, we can always
temporarily promote $\eta_{\mu\nu}$ to $g_{\mu\nu}$, multiply $d^4x$ by $\sqrt{-g}$, use (1) to find $T^{\mu\nu}(x)$, and then
set $g_{\mu\nu}$ back to the Minkowski metric $\eta_{\mu\nu}$. We are guaranteed, as we will show shortly, to
obtain an energy momentum tensor satisfying $D_\mu T^{\mu\nu}(x) = 0$, and hence $\partial_\mu T^{\mu\nu}(x) = 0$
in flat spacetime. In contrast to a formula like $E_k = \frac{1}{2} m v^2$, the definition (1) of energy
momentum can be applied to any theory based on the action principle, such as quantum
field theory.² You will explore this further in exercise 4.

More importantly, this fundamental definition of the energy momentum tensor leads us 
to a deep understanding of why energy and momentum are conserved. As the derivation is 
a bit long, I place it in appendix 1, where I will show that the conservation law $D_\mu T^{\mu\nu} = 0$
follows elegantly from the principle of general invariance. We would expect $T^{\mu\nu}$ to have
“nice” properties; it is the variation not of any old piece of junk, but of an exquisite object 
that controls how things move and that does not change under coordinate transformations. 


Appendix 1: Conservation of energy momentum 


As I emphasized way back in chapter II.4, conservation laws and symmetries are intimately connected. Here we
will exploit the general invariance of the matter action $S_{\text{matter}}$ to prove energy momentum conservation. We could
discuss this in the abstract, but just to be concrete, let us specialize to $S_{\text{Maxwell}} = -\frac{1}{4}\int d^4x \sqrt{-g}\, g^{\mu\rho} g^{\nu\sigma} F_{\mu\nu} F_{\rho\sigma}$.




In other words, we are taking the action for the world to be $S = S_{\text{EH}} + S_{\text{Maxwell}}$, namely a world with only gravity
and electromagnetism, and nothing else. 

Be forewarned. The following derivation may appear rather long to the novice, but it is completely general 
and hence quite profound. Every step may seem trivially true, but that might well be the way of the Zen master. 
Don’t be lulled to sleep, and watch the primes, and even more importantly, the absence of primes, like a hawk. 

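General invariance means that $S_{\text{Maxwell}}$ remains unchanged if we make these replacements:

$$x \to x', \quad g_{\rho\sigma}(x) \to g'_{\rho\sigma}(x'), \quad g^{\rho\sigma}(x) \to g'^{\rho\sigma}(x'), \quad A_\rho(x) \to A'_\rho(x') \tag{15}$$

This may seem like a long list, but you understand that all we are doing is making a coordinate transformation. As
always, $g'_{\rho\sigma}(x') = g_{\mu\nu}(x)(S^{-1})^\mu{}_\rho (S^{-1})^\nu{}_\sigma$, $g'^{\rho\sigma}(x') = S^\rho{}_\mu S^\sigma{}_\nu\, g^{\mu\nu}(x)$, $A'_\rho(x') = A_\mu(x)(S^{-1})^\mu{}_\rho$, with $S^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu}$ and
$(S^{-1})^\mu{}_\nu = \frac{\partial x^\mu}{\partial x'^\nu}$.

We presently specialize to an infinitesimal transformation $x'^\mu = x^\mu + \varepsilon^\mu(x)$ so that, to leading order, $S^\mu{}_\nu =
\delta^\mu_\nu + \partial_\nu \varepsilon^\mu(x)$ and $(S^{-1})^\mu{}_\nu = \delta^\mu_\nu - \partial_\nu \varepsilon^\mu(x)$. Then $A'_\rho(x') = A_\mu(x)(S^{-1})^\mu{}_\rho = A_\rho(x) - A_\mu(x)\,\partial_\rho \varepsilon^\mu(x)$ and

$$g'_{\rho\sigma}(x') = g_{\mu\nu}(x)(S^{-1})^\mu{}_\rho (S^{-1})^\nu{}_\sigma = g_{\rho\sigma}(x) - g_{\mu\sigma}(x)\,\partial_\rho \varepsilon^\mu(x) - g_{\rho\nu}(x)\,\partial_\sigma \varepsilon^\nu(x) \tag{16}$$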


Keep that in the back of your mind. 
After we make those replacements in (15), we end up with

$$S_{\text{Maxwell}} = -\frac{1}{4}\int d^4x' \sqrt{-g'(x')}\, g'^{\mu\rho}(x')\, g'^{\nu\sigma}(x')\, F'_{\mu\nu}(x')\, F'_{\rho\sigma}(x')$$

(with $F'_{\mu\nu}(x') = \partial'_\mu A'_\nu(x') - \partial'_\nu A'_\mu(x')$). We have exactly the same $S_{\text{Maxwell}}$ we started with, and so $\delta S_{\text{Maxwell}} = 0$.

Looks like we did nothing! We have merely verified that $S_{\text{Maxwell}}$ is invariant under general coordinate
transformation. But the magic trick is about to begin.

Since $x'$ is a dummy integration variable, we can erase all the primes on $x'$ in that integral for $S_{\text{Maxwell}}$ displayed
in the preceding paragraph. So go ahead and do it. I will wait for you.

You didn’t erase the prime on $g'$ and $A'$, did you?

If you did, you need to review your calculus. The dummy $x'$ can be replaced by anything you want, in particular
$x$. But of course you and I have no right to erase the primes on the dynamical fields $g'_{\rho\sigma}$ and $A'_\rho$. (Here and
henceforth, all statements made about $g'_{\rho\sigma}$ also apply mutatis mutandis to $g'^{\rho\sigma}$, which after all just denotes the
inverse.) In other words, $S_{\text{Maxwell}} = -\frac{1}{4}\int d^4x \sqrt{-g'(x)}\, g'^{\mu\rho}(x)\, g'^{\nu\sigma}(x)\, F'_{\mu\nu}(x)\, F'_{\rho\sigma}(x)$. The net effect is that we have
replaced $A_\rho(x) \to A'_\rho(x) = A'_\rho(x') - (A'_\rho(x') - A'_\rho(x))$ and $g_{\rho\sigma}(x) \to g'_{\rho\sigma}(x) = g'_{\rho\sigma}(x') - (g'_{\rho\sigma}(x') - g'_{\rho\sigma}(x))$. To
leading order in $\varepsilon$, we made the change

$$\begin{aligned}
\delta g_{\rho\sigma}(x)^{\text{specific}} &\equiv g'_{\rho\sigma}(x) - g_{\rho\sigma}(x) = \{g'_{\rho\sigma}(x') - g_{\rho\sigma}(x)\} - \{g'_{\rho\sigma}(x') - g'_{\rho\sigma}(x)\} \\
&= -\left(g_{\mu\sigma}(x)\,\partial_\rho \varepsilon^\mu(x) + g_{\rho\nu}(x)\,\partial_\sigma \varepsilon^\nu(x) + \varepsilon^\lambda \partial_\lambda g_{\rho\sigma}(x)\right) + O(\varepsilon^2)
\end{aligned} \tag{17}$$

In the last step, we simplified the first bracket using (16), which I told you to keep in the back of your mind, and
the second bracket using calculus to the order indicated. We put a superscript on $\delta g_{\rho\sigma}(x)$ to remind us that this
is a specific variation of the metric, given by the specific form in (18), not a general variation.

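Recalling chapter V.6, you might have recognized that we can write this in terms of covariant derivatives of $\varepsilon^\mu$:

$$\delta g_{\rho\sigma}^{\text{specific}} = -(\varepsilon_{\rho;\sigma} + \varepsilon_{\sigma;\rho}) + O(\varepsilon^2) \tag{18}$$

(Incidentally, the expression inside the parentheses could also be written in terms of the Lie derivative
$\mathcal{L}_\varepsilon g_{\rho\sigma}$.)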
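
For readers who would like to see why the non-covariant-looking expression in (17) equals the covariant one in (18), here is a sketch of the algebra. It assumes only the standard definitions $\varepsilon_{\rho;\sigma} = \partial_\sigma \varepsilon_\rho - \Gamma^\lambda_{\rho\sigma}\varepsilon_\lambda$ and $\varepsilon_\rho = g_{\rho\mu}\varepsilon^\mu$, together with the usual expression for the Christoffel symbol in terms of derivatives of the metric.

```latex
\begin{aligned}
\varepsilon_{\rho;\sigma} + \varepsilon_{\sigma;\rho}
 &= \partial_\sigma \varepsilon_\rho + \partial_\rho \varepsilon_\sigma
    - 2\,\Gamma^{\lambda}_{\rho\sigma}\,\varepsilon_\lambda \\
 &= \partial_\sigma\!\left(g_{\rho\mu}\varepsilon^\mu\right)
    + \partial_\rho\!\left(g_{\sigma\mu}\varepsilon^\mu\right)
    - \varepsilon^\lambda\left(\partial_\rho g_{\lambda\sigma}
    + \partial_\sigma g_{\lambda\rho} - \partial_\lambda g_{\rho\sigma}\right) \\
 &= g_{\mu\sigma}\,\partial_\rho \varepsilon^\mu
    + g_{\rho\nu}\,\partial_\sigma \varepsilon^\nu
    + \varepsilon^\lambda\,\partial_\lambda g_{\rho\sigma}.
\end{aligned}
```

The derivatives of the metric produced by the product rule in the second line cancel against two of the three terms from the Christoffel symbol, leaving exactly the parenthesis in (17).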

Similarly, we can work out the change $\delta A_\rho(x)^{\text{specific}} \equiv A'_\rho(x) - A_\rho(x)$ to be some specific expression involving
$\varepsilon$ and its derivatives, the same kind of expression as in (18) but, as you will see, we don’t need to know the
specific form.

As I said, everything seems trivial step by step. But now let us put the pieces together: the variation $\delta S_{\text{Maxwell}}$
consists of two terms, one due to the variation $\delta A_\rho(x)^{\text{specific}}$, the other to $\delta g_{\rho\sigma}(x)^{\text{specific}}$. Since we also know that
$\delta S_{\text{Maxwell}}$ vanishes by general invariance, we obtain

$$0 = \delta S_{\text{Maxwell}} = \int d^4x\, (\cdots)^\rho\, \delta A_\rho(x)^{\text{specific}} + \frac{1}{2}\int d^4x \sqrt{-g}\, T^{\rho\sigma}(x)\, \delta g_{\rho\sigma}(x)^{\text{specific}} \tag{19}$$

by virtue of (1).
In the first term, the expression denoted by $(\cdots)^\rho$ vanishes. Indeed, according to Euler and Lagrange, that is
how we derive the equation of motion for the electromagnetic field $A_\rho(x)$. We vary the action $S_{\text{Maxwell}}$ with respect
to a general variation of $A_\rho(x)$ and set that variation of the action to 0 to obtain the equation of motion. If this
variation of $S_{\text{Maxwell}}$ vanishes for a general variation $\delta A_\rho(x)$, then a fortiori it vanishes for the specific variation
$\delta A_\rho(x)^{\text{specific}}$. This point is worth emphasizing: the rest of this derivation can proceed only if the electromagnetic
field satisfies Maxwell’s equations. But of course: the energy momentum tensor (8) of the electromagnetic field
can hardly be expected to be conserved if the electromagnetic field is not going to vary in time according to
the rules!

To obtain the second term in (19), we use the definition (1), substituting for $\delta g_{\rho\sigma}$ the specific variation
$\delta g_{\rho\sigma}(x)^{\text{specific}}$ in (18). We thus conclude that $0 = \int d^4x \sqrt{-g}\, T^{\rho\sigma}(x)\, \delta g_{\rho\sigma}(x)^{\text{specific}}$.

Confusio looks puzzled. We knew he would! 

“You mean (1) is just zero?” 

“No! The statement in (1) says that if you vary $S_{\text{matter}}$ by varying $g_{\mu\nu}$ arbitrarily, the coefficient gives you $T^{\mu\nu}$.
It is the definition of $T^{\mu\nu}$,” you and I say in unison, “but the statement we just derived says that the variation of
$S_{\text{matter}}$ vanishes for a specific $\delta g_{\rho\sigma}(x)^{\text{specific}}$ as given in (18).”

So, we just derived 


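$$\int d^4x \sqrt{-g}\, T^{\rho\sigma}\left(\varepsilon_{\rho;\sigma} + \varepsilon_{\sigma;\rho}\right) = 0 \tag{20}$$

The expression in the parentheses in (20), since it is multiplied by a symmetric tensor, can be replaced by $2\varepsilon_{\rho;\sigma}$.
A (covariant) integration by parts immediately gives (since $\varepsilon^\mu(x)$ is arbitrary) what we set out to prove:

$$D_\mu T^{\mu\nu} = 0 \tag{21}$$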
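
The covariant integration by parts can be sketched explicitly. This sketch assumes that $\varepsilon^\mu(x)$ vanishes outside some bounded region, so that the total derivative term integrates to zero, and it uses the fact that the covariant divergence of a vector density is an ordinary divergence.

```latex
\begin{aligned}
0 &= \int d^4x \sqrt{-g}\; T^{\rho\sigma} D_\sigma \varepsilon_\rho
   = \int d^4x \sqrt{-g}\; D_\sigma\!\left(T^{\rho\sigma}\varepsilon_\rho\right)
   - \int d^4x \sqrt{-g}\; \varepsilon_\rho\, D_\sigma T^{\rho\sigma}, \\
  &\int d^4x \sqrt{-g}\; D_\sigma\!\left(T^{\rho\sigma}\varepsilon_\rho\right)
   = \int d^4x\; \partial_\sigma\!\left(\sqrt{-g}\; T^{\rho\sigma}\varepsilon_\rho\right)
   = 0.
\end{aligned}
```

Since $\varepsilon_\rho$ is otherwise arbitrary, the remaining integral can vanish only if $D_\sigma T^{\rho\sigma} = 0$, which is the conservation law after relabeling indices and using the symmetry of the tensor.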


For the discussion in the next appendix, it is also illuminating to write (21) as 


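$$D_\mu T^{\mu\nu} = \frac{1}{\sqrt{-g}}\, \partial_\mu\!\left(\sqrt{-g}\, T^{\mu\nu}\right) + \Gamma^\nu_{\mu\lambda} T^{\mu\lambda} = 0 \tag{22}$$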


Some remarks follow. 


1. It is clear from the derivation that we get covariant conservation of the energy momentum tensor only if
the equations of motion for the matter degrees of freedom (here the electromagnetic field) are satisfied. 
This makes physical sense of course. 


2. Here we set $S_{\text{matter}}$ to $S_{\text{Maxwell}}$ to be concrete, but clearly all the action (pardon, all the juice) is in
the variation of $g_{\mu\nu}$. The electromagnetic field $A_\mu(x)$ just went along for the ride. We didn’t even need
to know in detail the expression for $\delta A_\rho(x)^{\text{specific}}$ and $(\cdots)^\rho$ in (19). Indeed, $S_{\text{matter}}$, the action for
everything else in the universe besides the gravitational field, may very well contain 47 fields all madly
interacting with one another. Then the first term in (19) would be replaced by the sum of 47 analogous
terms, with Euler and Lagrange assuring us that every one of the 47 analogs of $(\cdots)^\rho$ would vanish.


3. I am assuming that the only field you the reader know about is the electromagnetic field, and that 
you are reading this book to learn about the gravitational field. At this stage, you might think of what 
we normally call matter as a collection of particles described by $S_{\text{particles}}$. But when you go on to quantum
field theory, you will learn that everything in the universe is described by fields, hence the preceding 
remark. The ugly asymmetry in treating particles and fields at this level of physics in fact provides a 
strong motivation for the development of quantum field theory, as already alluded to in chapters II.3 
and IV.2. 


4. I leave it to you as an exercise to derive energy momentum conservation for $S_{\text{particles}}$.


5. Refer back to the point made in remark 2. The action for everything else in the universe besides the 
gravitational field, $S_{\text{matter}}$, contains many terms. Some terms describe interactions among different
dynamical variables; for example, the term (13) contains both $X^\mu$ and $A_\mu(x)$. These terms contribute
to the equations of motion of course and hence to the derivation of (20). Thus, the conservation law
$D_\mu T^{\mu\nu} = 0$ indeed takes into account the interactions among different types of matter, as it must on
physical grounds. 


6. Notice that in determining the energy momentum tensor and in proving that it is conserved, we vary 
not the entire action of the world $S = S_{\text{EH}} + S_{\text{matter}}$, but only $S_{\text{matter}}$. This is crucial.


7. As I have said, in theoretical physics, often the more profound results have simple, almost trivial, 
derivations, at least in hindsight. Now that you understand energy momentum conservation in curved 
spacetime, you can see that much of the long discussion leading up to (19) could in fact have been 
dispensed with. All we need is $\delta g_{\rho\sigma}^{\text{specific}} = -(\varepsilon_{\rho;\sigma} + \varepsilon_{\sigma;\rho})$ (see (18)) under an infinitesimal coordinate
transformation.


Suppose we have a generic matter action $S_{\text{matter}}$ with the dynamical variable $\Phi$ (whose indices, if any, we
suppress). Then general coordinate invariance says, simply and clearly,

$$\begin{aligned}
0 = \delta S_{\text{matter}} &= \int d^4x\, (\cdots)\, \delta\Phi(x)^{\text{specific}} + \frac{1}{2}\int d^4x \sqrt{-g}\, T^{\rho\sigma}(x)\, \delta g_{\rho\sigma}(x)^{\text{specific}} \\
&= \frac{1}{2}\int d^4x \sqrt{-g}\, T^{\rho\sigma}(x)\, \delta g_{\rho\sigma}(x)^{\text{specific}}
\end{aligned} \tag{23}$$

where the third equality follows from the matter equation of motion. (As noted in remark 2 above, the matter
variation may involve a sum of terms.) Plugging in $\delta g_{\rho\sigma}^{\text{specific}}$, we obtain energy momentum conservation
immediately.* See also exercise 5.


Appendix 2: Energy momentum of the gravitational field 


The presence in (22) of the second term, mandated by the construction of the covariant derivative and the fact 
that the energy momentum tensor carries two indices, indicates that the conservation of energy momentum in 
general relativity is more subtle than you might have expected. Contrast this with the covariant conservation of a 
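current (the electromagnetic current, for example) $D_\rho J^\rho = 0$, which if written out, reads $\frac{1}{\sqrt{-g}}\, \partial_\rho(\sqrt{-g}\, J^\rho) = 0$.
Integrate this over a 4-dimensional spacetime region $V$:

$$\int_V d^4x \sqrt{-g}\, D_\rho J^\rho = 0 = \int_V d^4x\, \partial_\rho(\sqrt{-g}\, J^\rho) = \int_{\partial V} dS_\rho \sqrt{-g}\, J^\rho$$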


where $\partial V$ denotes the boundary of $V$ and $dS_\rho$ a 3-dimensional “surface” element. We see that the factors of
$\sqrt{-g}$ are arranged in precisely such a way as to allow us to use the divergence theorem suitably generalized to
4-dimensional spacetime. This is in accord with our physical intuition that $D_\rho J^\rho = 0$ implies that current does
not flow out of the 4-dimensional spacetime region. 
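
The difference between the one-index and the two-index case can be made explicit in a few lines. The sketch below assumes only the standard contracted Christoffel symbol, which follows from the variation of the determinant, $\delta\sqrt{-g} = \frac{1}{2}\sqrt{-g}\, g^{\mu\nu}\delta g_{\mu\nu}$.

```latex
\begin{aligned}
\Gamma^{\rho}_{\rho\lambda}
 &= \tfrac{1}{2}\, g^{\rho\sigma}\,\partial_\lambda g_{\rho\sigma}
  = \frac{1}{\sqrt{-g}}\,\partial_\lambda \sqrt{-g}, \\
D_\rho J^\rho
 &= \partial_\rho J^\rho + \Gamma^{\rho}_{\rho\lambda} J^\lambda
  = \frac{1}{\sqrt{-g}}\,\partial_\rho\!\left(\sqrt{-g}\, J^\rho\right), \\
D_\rho T^{\rho\sigma}
 &= \partial_\rho T^{\rho\sigma} + \Gamma^{\rho}_{\rho\lambda} T^{\lambda\sigma}
    + \Gamma^{\sigma}_{\rho\lambda} T^{\rho\lambda}
  = \frac{1}{\sqrt{-g}}\,\partial_\rho\!\left(\sqrt{-g}\, T^{\rho\sigma}\right)
    + \Gamma^{\sigma}_{\rho\lambda} T^{\rho\lambda}.
\end{aligned}
```

For a vector the single Christoffel term is absorbed into a total derivative. For a two-index tensor the Christoffel symbol attached to the second index is left over, and it is this term that obstructs a straightforward application of the divergence theorem.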

The second term in (22) tells us that (21) no longer implies that $\int_{\partial V} dS_\rho \sqrt{-g}\, T^{\rho\sigma}$ vanishes. But this apparently
puzzling conclusion is in fact physically correct. Since the gravitational field itself carries energy momentum,
we cannot possibly demand that $T^{\rho\sigma}$ does not flow out of a 4-dimensional spacetime region.

Indeed, the equivalence principle asserts forcefully that any definition of the energy momentum carried by 
the gravitational field cannot possibly be valid locally. We know that locally, we can always transform away the 
gravitational fields. 

I might mention in passing, merely for the sake of completeness, that it is possible³ to find an object with
two indices $t^{\rho\sigma}$, known as the energy momentum pseudotensor, such that $\partial_\rho(T^{\rho\sigma} + t^{\rho\sigma}) = 0$. I strongly prefer
to stay away from objects that are manifestly not tensors, and equations (such as the one just mentioned) that 
hold only in a specific coordinate system seem to be contrary to the very spirit of relativity. Suffice it to note that 
the ensuing discussion can become extremely involved. 


Exercises 


1 Show that $T_{ij} = -(E_i E_j + B_i B_j) + \frac{1}{2}\delta_{ij}(\vec E^{\,2} + \vec B^{\,2})$ and hence $T = 0$ in flat spacetime.

2 Show that the stress energy tensor obtained from $S_{\text{particles}}$ is covariantly conserved.

3 Verify explicitly that $T^{\mu\nu}$ in (14) for a collection of charged particles is covariantly conserved.


* Incidentally, this is very similar to the derivation of current conservation in electromagnetism using gauge 
invariance. 
4 Show that for the action (introduced in appendix 7 in chapter V.6)

$$S_{\text{scalar}} = -\int d^4x \sqrt{-g}\left(\frac{1}{2}(\partial\varphi)^2 + V(\varphi)\right) \tag{24}$$

where $(\partial\varphi)^2 = g^{\rho\sigma}\,\partial_\rho\varphi\,\partial_\sigma\varphi$, the energy momentum tensor is given by

$$T^{\mu\nu} = \partial^\mu\varphi\,\partial^\nu\varphi - g^{\mu\nu}\left(\frac{1}{2}(\partial\varphi)^2 + V(\varphi)\right) \tag{25}$$

As you will learn in a course on quantum field theory, $S_{\text{scalar}}$ describes a self-interacting scalar field. Evaluate
$T^{00}$ in flat spacetime.


5  Pedagogically, the derivation of energy momentum conservation can be made even more transparent by using 
the cosmological action $S_c = \Lambda \int d^4x \sqrt{-g}\, g^{\mu\nu} g_{\mu\nu}$ rather than $S_{\text{Maxwell}}$, since there is no electromagnetic
gauge potential to keep track of. Work this out. 


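6 Note that in deriving (18), we did not use any specific property of $g_{\mu\nu}$. In other words, show that for any
two-indexed tensor $S_{\rho\sigma}(x)$, we have

$$\delta S_{\rho\sigma}(x)^{\text{specific}} \equiv S'_{\rho\sigma}(x) - S_{\rho\sigma}(x) = -\left(S_{\mu\sigma}(x)\,\partial_\rho \varepsilon^\mu(x) + S_{\rho\nu}(x)\,\partial_\sigma \varepsilon^\nu(x) + \varepsilon^\lambda \partial_\lambda S_{\rho\sigma}(x)\right) + O(\varepsilon^2) \tag{26}$$

You can readily generalize this expression to any tensor.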


7 Suppose you are given the energy momentum tensor of a point particle $T^{\mu\nu}(x) = \frac{m}{\sqrt{-g}}\int d\tau\, \frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau}\, \delta^4(x - X(\tau))$.
Show that energy momentum conservation $D_\mu T^{\mu\nu} = 0$ requires that the particle follows a geodesic,
precisely as you would expect.


8 In cosmology, an ideal fluid that exerts no pressure is known as dust. Show that with $T^{\mu\nu} = \rho\, U^\mu U^\nu$, energy
momentum conservation implies $U^\mu D_\mu U^\nu = 0$.


Notes 


1. Historically, the realization that the sum of “different forms” of energy is conserved represents a tremendous 
conceptual advance for physics. It was enunciated by, among others, Count Rumford of the Holy Roman 
Empire. While supervising the boring of cannons in Bavaria, he noticed how hot the cannons became and 
theorized that the heat was put in there by the team of horses doing the boring work. Incidentally, Count 
Rumford was born Benjamin Thompson in Massachusetts: he fled to England on the eve of the American
Revolution, what we would call a traitor then and now. His nobility was bestowed by the ruler of Bavaria,
whom he served. While professionally I benefited from his conservation principle, personally I benefited, 
when I lived in Munich, from the English garden he established there. Count Rumford later endowed the 
Rumford professorship at Harvard University, where he allegedly, as a poor boy growing up, audited physics 
courses without permission. Somehow they don’t make physicists like that any more. 

2. QFT Nut, p. 78. 

3. See L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields, Addison-Wesley, 1971.
Einstein was . . . one of the friendliest of men. I had the
impression that he was also, in an important sense, alone. Many 
very great men are lonely. 


—Freeman Dyson¹


The Einstein tensor 


In a mockery of a standard proverb, we kept putting off until tomorrow what we did not 
need today. Ever since chapter VI.1, we have been avoiding the labor of varying the Einstein- 
Hilbert action 


$$S_{\text{EH}} = \frac{1}{16\pi G}\int d^4x \sqrt{-g}\, R \tag{1}$$


with respect to $g_{\mu\nu}$ for as long as we could. Amusingly, we managed to get pretty far;
without ever varying $S_{\text{EH}}$, we worked out an expanding universe in chapter VI.2 and the
Schwarzschild solution around a star or a black hole in chapter VI.3. Finally, finally, we
now do the heavy lifting and vary $S_{\text{EH}}$. But as we will see presently, thanks to an identity,
the task is not as onerous as we feared. In fact, with our setup, it is downright easy.

So let us vary $I \equiv \int d^4x \sqrt{-g}\, g^{\rho\sigma} R_{\rho\sigma}$. We need to vary $g^{\rho\sigma}$, $\sqrt{-g}$, and $R_{\rho\sigma}$ with respect
to $g_{\mu\nu}$. There are thus three pieces that we will attend to in turn.

But wait, didn’t we already learn how to vary $g^{\rho\sigma}$ and $\sqrt{-g}$ back in chapter V.6? Indeed,
we did it again in the preceding chapter. Thus, two of the three pieces are really easy.

First, the easiest piece: $\delta_1 I = \int d^4x \sqrt{-g}\, R_{\rho\sigma}\, \delta g^{\rho\sigma}$. Use (VI.4.5), $\delta g^{\rho\sigma} = -g^{\rho\mu}\,\delta g_{\mu\nu}\, g^{\nu\sigma}$,
to obtain $\delta_1 I = \int d^4x \sqrt{-g}\, R_{\rho\sigma}(-g^{\rho\mu}\,\delta g_{\mu\nu}\, g^{\nu\sigma}) = -\int d^4x \sqrt{-g}\, R^{\mu\nu}\,\delta g_{\mu\nu}$.

Next, vary the determinant $g = \det g_{\mu\nu}$. Use (VI.4.7), $\delta\sqrt{-g} = \frac{1}{2}\sqrt{-g}\, g^{\mu\nu}\,\delta g_{\mu\nu}$, to obtain
$\delta_2 I = \int d^4x \sqrt{-g}\left(\frac{1}{2} g^{\mu\nu} R\right)\delta g_{\mu\nu}$.

To keep count, we have thus far $(\delta_1 + \delta_2) I = -\int d^4x \sqrt{-g}\left(R^{\mu\nu} - \frac{1}{2} g^{\mu\nu} R\right)\delta g_{\mu\nu}$. The
particular combination of the Ricci tensor and the scalar curvature that appears here is so important that
it is known as the Einstein tensor²

$$E_{\mu\nu} \equiv R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R \tag{2}$$

Finally, we tackle the most frightening piece of all: $\delta_3 I = \int d^4x \sqrt{-g}\, g^{\mu\nu}\,\delta R_{\mu\nu}$. We are
to subtract the Ricci tensor $R_{\mu\nu}$ calculated from the metric $g_{\mu\nu}$ from $\tilde R_{\mu\nu}$ calculated from
$\tilde g_{\mu\nu} = g_{\mu\nu} + \delta g_{\mu\nu}$ to obtain $\delta R_{\mu\nu}$. Given the rather complicated expression for the Riemann
and Ricci tensors, this would seem to involve a lot of work.


Palatini identity 


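Fortunately, our lives are made easy by some key observations. To get oriented, let’s not
worry about indices and vary the schematic expression (with antisymmetrization understood)
for the Riemann tensor $R^{\cdot}{}_{\cdot\cdot\cdot} \sim \partial_\cdot \Gamma^{\cdot}_{\cdot\cdot} + \Gamma^{\cdot}_{\cdot\cdot}\Gamma^{\cdot}_{\cdot\cdot}$ to obtain $\delta R^{\cdot}{}_{\cdot\cdot\cdot} \sim \partial_\cdot \delta\Gamma^{\cdot}_{\cdot\cdot} + \delta\Gamma^{\cdot}_{\cdot\cdot}\,\Gamma^{\cdot}_{\cdot\cdot} +
\Gamma^{\cdot}_{\cdot\cdot}\,\delta\Gamma^{\cdot}_{\cdot\cdot}$. So the calculation depends on first determining $\delta\Gamma^{\cdot}_{\cdot\cdot}$.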

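Look at how the Christoffel symbols transform from (II.2.29) and (V.6.6):

$$\Gamma'^{\mu}_{\nu\lambda} = S^\mu{}_\kappa (S^{-1})^\omega{}_\nu (S^{-1})^\rho{}_\lambda\, \Gamma^{\kappa}_{\omega\rho} + S^\mu{}_\kappa (S^{-1})^\omega{}_\nu\, \partial_\omega (S^{-1})^\kappa{}_\lambda \tag{3}$$

At the risk of repeating ourselves, we emphasize again what we mean by varying $\Gamma^{\mu}_{\nu\lambda}$. We
are to calculate $\Gamma^{\mu}_{\nu\lambda}$ using the metric $g_{\mu\nu}$ and $\tilde\Gamma^{\mu}_{\nu\lambda}$ using a metric $\tilde g_{\mu\nu}$ slightly different from
$g_{\mu\nu}$ and then calculate the difference $\delta\Gamma^{\mu}_{\nu\lambda} = \tilde\Gamma^{\mu}_{\nu\lambda} - \Gamma^{\mu}_{\nu\lambda}$.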

Confusio: “So it is not about comparing $\Gamma'^{\mu}_{\nu\lambda}$ and $\Gamma^{\mu}_{\nu\lambda}$?”

No no no! We are varying $g_{\mu\nu}$, not transforming $g_{\mu\nu}$.

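But that’s a common confusion, because in fact we are going to use (3) right now. Under
the same coordinate transformation as in (3), $\tilde\Gamma'$ and $\tilde\Gamma$ are going to be related by an
equation obtained from (3) simply by putting tildes on the $\Gamma$’s, namely

$$\tilde\Gamma'^{\mu}_{\nu\lambda} = S^\mu{}_\kappa (S^{-1})^\omega{}_\nu (S^{-1})^\rho{}_\lambda\, \tilde\Gamma^{\kappa}_{\omega\rho} + S^\mu{}_\kappa (S^{-1})^\omega{}_\nu\, \partial_\omega (S^{-1})^\kappa{}_\lambda \tag{4}$$

Subtracting (3) from (4), we obtain

$$\delta\Gamma'^{\mu}_{\nu\lambda} \equiv \tilde\Gamma'^{\mu}_{\nu\lambda} - \Gamma'^{\mu}_{\nu\lambda} = S^\mu{}_\kappa (S^{-1})^\omega{}_\nu (S^{-1})^\rho{}_\lambda \left(\tilde\Gamma^{\kappa}_{\omega\rho} - \Gamma^{\kappa}_{\omega\rho}\right) = S^\mu{}_\kappa (S^{-1})^\omega{}_\nu (S^{-1})^\rho{}_\lambda\, \delta\Gamma^{\kappa}_{\omega\rho} \tag{5}$$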


The inhomogeneous term $\sim S\, S^{-1} \partial S^{-1}$ in (3) that makes the Christoffel symbol not a
tensor gets subtracted away. Remarkably, in contrast to $\Gamma^{\mu}_{\nu\lambda}$, which is assuredly not a tensor,
the variation $\delta\Gamma^{\mu}_{\nu\lambda} = \tilde\Gamma^{\mu}_{\nu\lambda} - \Gamma^{\mu}_{\nu\lambda}$ is a tensor.

This exemplifies precisely what we talked about in chapter V.6: the “character defects” in
$\tilde\Gamma^{\mu}_{\nu\lambda}$ and $\Gamma^{\mu}_{\nu\lambda}$ cancel out. Exploiting this fact, we can derive a nice identity for the variation
of the Riemann curvature tensor.

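Professor Flat pops up just in time, up to his usual trick. “Go to locally flat coordinates,”
he urges us. Look at $\delta R^{\cdot}{}_{\cdot\cdot\cdot} \sim \partial_\cdot \delta\Gamma^{\cdot}_{\cdot\cdot} + \delta\Gamma^{\cdot}_{\cdot\cdot}\,\Gamma^{\cdot}_{\cdot\cdot} + \Gamma^{\cdot}_{\cdot\cdot}\,\delta\Gamma^{\cdot}_{\cdot\cdot}$. At a point where the coordinates
are flat, $\Gamma(x) = 0$, and our expression simplifies to $\delta R^{\cdot}{}_{\cdot\cdot\cdot} \sim \partial_\cdot \delta\Gamma^{\cdot}_{\cdot\cdot}$ so that

$$\delta R^{\mu}{}_{\nu\lambda\sigma}(x) = \partial_\lambda\, \delta\Gamma^{\mu}_{\nu\sigma}(x) - \partial_\sigma\, \delta\Gamma^{\mu}_{\nu\lambda}(x) \tag{6}$$

But at that point, since $\Gamma = 0$, the ordinary derivative $\partial$ is the same as the covariant
derivative $D$, thus enabling us to write

$$\delta R^{\mu}{}_{\nu\lambda\sigma} = D_\lambda\, \delta\Gamma^{\mu}_{\nu\sigma} - D_\sigma\, \delta\Gamma^{\mu}_{\nu\lambda} \tag{7}$$

an equality known as the Palatini identity.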
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Professor Flat reminds us once again that since the Palatini identity is an equality 
between tensors, it holds not only for locally flat coordinates, but in general! 
In particular, the variation of the Ricci tensor is given by 


BRyy = 5RE,, = DT’, — D,sVe,, (8) 


and so 632 = [ d*x./gg"”(D, oT", — D,d1t,,). Now integrate by parts (as explained in 
chapter V.6) and use the identity (V.6.15) D,g“” = 0. Happily, we conclude that 637 is a 
surface term and does not affect the equation of motion! 


The Einstein field equation 


Putting it all together, we have $\delta\mathcal{S} = (\delta_1 + \delta_2 + \delta_3)\mathcal{S} = -\int d^4x\sqrt{-g}\,(R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R)\,\delta g_{\mu\nu}$ +
a surface term.³ Remember, the $\frac{1}{2}$ comes from varying the square root, and the relative
minus sign from varying the inverse.

In chapter VI.1, we tried to avoid work and simply denoted the relative coefficient between
$R^{\mu\nu}$ and $g^{\mu\nu}R$ by $a$. This unknown constant $a$ turns out to be $-\frac{1}{2} \neq -\frac{1}{4}$, as we had hoped
all along, and so everything we did in chapters VI.2 and VI.3 is okay. (Also note that
$A = -1$, and so the result $A(1 + 4a) = 1$ we obtained in appendix 1 to chapter VI.1 is
indeed satisfied.)

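Noting that $S_{\rm EH} = \frac{1}{16\pi G}\mathcal{S}$, we obtain

$$\delta S = \delta S_{\rm EH} + \delta S_{\rm matter} = \int d^4x\sqrt{-g}\left\{-\frac{1}{16\pi G}\left(R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R\right) + \frac{1}{2}T^{\mu\nu}\right\}\delta g_{\mu\nu} \tag{9}$$

We have derived the wondrous glorious stupendous tremendous Einstein’s field equation

$$R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R = 8\pi G\,T^{\mu\nu} \tag{10}$$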


A parade of greats, Euler, Lagrange, Riemann, Christoffel, Ricci, Einstein, Palatini, and 
many many others, have brought us to this, one of the most profound statements in 
physics: The distribution of energy in spacetime governs the curvature of spacetime. 

Often $T^{\mu\nu}$ is given explicitly in a relatively simple form, for example, $T^{\mu\nu} = (\rho + P)U^{\mu}U^{\nu} + Pg^{\mu\nu}$
for a perfect fluid. A trivial rewrite of (10) is thus more convenient:
contract (10) with $g_{\mu\nu}$ to obtain $R = -8\pi GT$ so that

$$R^{\mu\nu} = 8\pi G\left(T^{\mu\nu} - \frac{1}{2}g^{\mu\nu}T\right) \tag{11}$$
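The arithmetic behind the step from (10) to (11) can be checked with a few lines of Python. This is only a sketch under stated assumptions: we work in a locally flat frame with $\eta = \mathrm{diag}(-1, 1, 1, 1)$, take the fluid at rest, and pick arbitrary illustrative values for $\rho$ and $P$; the function names are hypothetical and not part of the text.

```python
from fractions import Fraction

# Assumptions for this sketch: locally flat frame, eta = diag(-1, 1, 1, 1),
# fluid at rest so U^mu = (1, 0, 0, 0). All names here are illustrative only.
ETA = [-1, 1, 1, 1]  # diagonal of eta (numerically equal to its inverse)


def perfect_fluid(rho, P):
    """Diagonal of T^{mu nu} = (rho + P) U^mu U^nu + P eta^{mu nu} in the rest frame."""
    U = [1, 0, 0, 0]
    return [(rho + P) * U[m] * U[m] + P * ETA[m] for m in range(4)]


def trace(T_upper):
    """T = eta_{mu nu} T^{mu nu} for a diagonal tensor."""
    return sum(ETA[m] * T_upper[m] for m in range(4))


def trace_reversed(T_upper):
    """Diagonal of T^{mu nu} - (1/2) eta^{mu nu} T, the source appearing in (11)."""
    T = trace(T_upper)
    return [T_upper[m] - Fraction(1, 2) * ETA[m] * T for m in range(4)]


rho, P = Fraction(5), Fraction(1)      # arbitrary illustrative values
T_upper = perfect_fluid(rho, P)        # (rho, P, P, P)
T = trace(T_upper)                     # -rho + 3P
source = trace_reversed(T_upper)       # ((rho + 3P)/2, (rho - P)/2, ...)

# Contracting (10) with the metric in 4 dimensions gives (1 + 4a) R = 8 pi G T
# with a = -1/2, so R / (8 pi G) = -T.
a = Fraction(-1, 2)
R_over_8piG = T / (1 + 4 * a)
```

In the rest frame the source on the right hand side of (11) comes out as $\frac{1}{2}(\rho + 3P)$ for the time-time component and $\frac{1}{2}(\rho - P)$ for each spatial diagonal component.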


Einstein remarked that the left hand side of his field equation (10) was born elegantly of
geometry, while the right hand side seemed to have an ugly ad hoc quality, with one term
after another thrown in* according to what kind of matter we chose to fill spacetime with.
Theoretical physicists have joked ever since that even Nature appears to prefer the left over
the right.


* This was particularly true in Einstein’s time, when matter was poorly understood. Hence Einstein’s unrealized
dream of a unified field theory with matter also described in geometrical terms.


The Newtonian limit 


We have one remaining task: show that G is what Newton said it was. Recover Newtonian 
gravity in the appropriate limit. In other words, we will verify the overall coefficient in the 
Einstein-Hilbert action (1). 

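Consider the weak gravitational field limit in which spacetime deviates little from
Minkowskian. Write $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ and expand $R_{\mu\nu}$ to linear order in $h$. First,
$\Gamma^{\rho}_{\mu\nu} = \frac{1}{2}\eta^{\rho\lambda}(\partial_{\mu}h_{\nu\lambda} + \partial_{\nu}h_{\mu\lambda} - \partial_{\lambda}h_{\mu\nu}) + O(h^2)$, so that
$R^{\rho}{}_{\mu\lambda\nu} = \partial_{\lambda}\Gamma^{\rho}_{\mu\nu} - \partial_{\nu}\Gamma^{\rho}_{\mu\lambda} + O(h^2)$. We obtain

$$R_{\mu\nu} = \frac{1}{2}\left(-\partial^2 h_{\mu\nu} + \partial_{\mu}\partial_{\lambda}h^{\lambda}{}_{\nu} - \partial_{\mu}\partial_{\nu}h^{\lambda}{}_{\lambda} + \partial_{\nu}\partial_{\lambda}h^{\lambda}{}_{\mu}\right) + O(h^2) \tag{12}$$

where evidently, to leading order, indices are raised by the Minkowski metric: $h^{\lambda}{}_{\nu} = \eta^{\lambda\rho}h_{\rho\nu}$.

Recall, from chapter V.4, that to recover Newtonian gravity we only need $h_{\mu\nu}$ not to
depend on time. Then $R_{00} \simeq -\frac{1}{2}\nabla^2 h_{00} = \nabla^2\Phi$, upon noting (from chapter V.4, and earlier,
chapter IV.3) that the Newtonian potential is given by $\Phi = -\frac{1}{2}h_{00}$. (Recall also (V.4.24)
in which we restored $c$ and took the nonrelativistic limit.) For Newtonian matter, $T_{00} \simeq \rho \gg T_{ij}$,
so that $T_{00} - \frac{1}{2}\eta_{00}T \simeq \frac{1}{2}\rho$. The field equation (11) thus reduces to $\nabla^2\Phi = 4\pi G\rho$,
showing that we have the right coefficient in (1).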
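As a sanity check on the coefficient, here is a small numerical sketch. The setup is an assumption chosen for illustration and does not appear in the text: a ball of uniform density, mass $M$ and radius `R_ball`, in units with $G = 1$, for which the interior Newtonian potential is $\Phi(r) = -GM(3R^2 - r^2)/(2R^3)$. The code estimates $\nabla^2\Phi$ by finite differences and compares it with $4\pi G\rho$ and with $8\pi G(T_{00} - \frac{1}{2}\eta_{00}T)$ for $T_{00} = \rho$, $T = -\rho$.

```python
import math

# Illustrative setup (not from the text): uniform-density ball, units with G = 1.
G, M, R_ball = 1.0, 1.0, 1.0
rho = M / (4.0 / 3.0 * math.pi * R_ball**3)


def Phi(r):
    """Newtonian potential inside a uniform ball of mass M and radius R_ball."""
    return -G * M * (3.0 * R_ball**2 - r**2) / (2.0 * R_ball**3)


def radial_laplacian(f, r, h=1e-3):
    """Finite-difference estimate of f'' + (2/r) f' for a spherically symmetric f."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / h**2
    return d2 + 2.0 * d1 / r


lhs = radial_laplacian(Phi, 0.5)   # nabla^2 Phi at r = 0.5, inside the ball
rhs = 4.0 * math.pi * G * rho      # 4 pi G rho

# Right hand side of (11) for Newtonian matter: T_00 = rho, eta_00 = -1, T = -rho.
source_00 = 8.0 * math.pi * G * (rho - 0.5 * (-1.0) * (-rho))
```

Both `lhs` and `source_00` agree with `rhs` to within rounding, which is the statement $R_{00} = \nabla^2\Phi = 4\pi G\rho$ in this toy configuration.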


Dark energy again 


Back in chapter VI.2, I did not vary the action $S_{\rm cosmological} = -\int d^4x\sqrt{-g}\,\Lambda$ but merely
argued on symmetry grounds what the energy momentum tensor of an expanding universe
must be, up to an overall constant. Using (VI.4.8), we can now vary instantaneously:
$\delta S_{\rm cosmological} = -\int d^4x\,(\frac{1}{2}\sqrt{-g}\,g^{\mu\nu})\,\delta g_{\mu\nu}\Lambda$ and thus by (VI.4.2),

$$T^{\mu\nu} = -\Lambda g^{\mu\nu} \tag{13}$$

In the flat spacetime limit, the energy density becomes $T^{00} = -\Lambda\eta^{00} = \Lambda$ as expected.
With $T = -4\Lambda$, the field equation (11) then reads

$$R^{\mu\nu} = 8\pi G\Lambda g^{\mu\nu} \tag{14}$$

The temporary symbol used in chapter VI.2 in place of the cosmological constant is in fact
just $8\pi G\Lambda$, so that the Hubble parameter was determined to be

$$H^2 = \frac{8\pi G}{3}\Lambda \tag{15}$$




(The slight cheat I committed back in chapter VI.2 was assuming implicitly that the temporary
symbol used there and $\Lambda$ have the same sign.) Note that the expanding universe (with its flat space*) we discussed
in chapter VI.2 only makes sense with a positive cosmological constant.
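The signs in (13), (14), and (15) are easy to lose track of, so here is a short Python sketch of the bookkeeping. It assumes a locally flat frame with $\eta = \mathrm{diag}(-1, 1, 1, 1)$, and the numerical values ($\Lambda = 3$, units with $8\pi G = 1$) are arbitrary choices made only for illustration.

```python
from fractions import Fraction

# Sketch in a locally flat frame with eta = diag(-1, 1, 1, 1). The value of Lam and
# the choice of units 8 pi G = 1 are arbitrary illustrative assumptions.
ETA = [-1, 1, 1, 1]
Lam = Fraction(3)
eight_pi_G = Fraction(1)

T_upper = [-Lam * ETA[m] for m in range(4)]       # (13): T^{mu nu} = -Lambda eta^{mu nu}
energy_density = T_upper[0]                       # T^{00} = +Lambda
T = sum(ETA[m] * T_upper[m] for m in range(4))    # trace, equal to -4 Lambda

# Right hand side of (11), component by component, divided by eta^{mu mu}.
coeff = [
    eight_pi_G * (T_upper[m] - Fraction(1, 2) * ETA[m] * T) / ETA[m]
    for m in range(4)
]
# Every entry of coeff equals 8 pi G Lambda, which is the content of (14).

H_squared = eight_pi_G * Lam / 3                  # (15)
```

With a positive `Lam` the energy density is positive and `H_squared` is positive, in line with the remark that the expanding universe of chapter VI.2 requires a positive cosmological constant.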

There is a silly debate regarding whether in (14) the $\Lambda$ term should be placed on the
left or right hand side, silly because those of you who have mastered algebra certainly feel
free (free country, remember?) to move it to the left hand side. But some people who write
the $\Lambda$ term on the left hand side then think of it as part of gravity, and go on to say that
anti-gravity, or a new repulsive force, has been found, a language I strongly disfavor. The
justification for this ill-advised language is that the only dynamical variable appearing in
$S_{\rm cosmological}$ is the metric. But as we saw in chapter VI.2, $S_{\rm cosmological}$ is inevitably produced
by fluctuating matter fields.


Bianchi identity 


One reason that it took Einstein 10 arduous years, from 1905 to 1915, to derive the field 
equation was that he didn’t know an identity due to Luigi Bianchi (1856-1928), which 
we will now derive. We’ve had an inkling of the existence of this identity ever since 
chapter VI.2. 

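Let us covariantly differentiate the Riemann curvature tensor $D_{\nu}R^{\tau}{}_{\rho\sigma\lambda} = D_{\nu}\{(\partial_{\sigma}\Gamma^{\tau}_{\rho\lambda} + \Gamma^{\tau}_{\sigma\kappa}\Gamma^{\kappa}_{\rho\lambda}) - (\partial_{\lambda}\Gamma^{\tau}_{\rho\sigma} + \Gamma^{\tau}_{\lambda\kappa}\Gamma^{\kappa}_{\rho\sigma})\}$.
Naturally, our favorite person pops up. Go to locally flat
coordinates, he mimes. Then at the chosen point, the Christoffel symbol vanishes, and
this whole mess collapses dramatically to $\partial_{\nu}R_{\tau\rho\sigma\lambda} = \partial_{\nu}\partial_{\sigma}\Gamma_{\tau\cdot\rho\lambda} - \partial_{\nu}\partial_{\lambda}\Gamma_{\tau\cdot\rho\sigma}$. (Note that
we have lowered the upper index.) Observe that the index structure on the right hand side
has the form $(\nu\sigma\lambda) - (\nu\lambda\sigma)$. So, cyclically permute the three indices $(\nu\sigma\lambda)$ and add the
results. Out pops the Bianchi identity

$$D_{\nu}R_{\tau\rho\sigma\lambda} + D_{\sigma}R_{\tau\rho\lambda\nu} + D_{\lambda}R_{\tau\rho\nu\sigma} = 0 \tag{16}$$

As always, because this is a tensor identity, it holds in general, even though it is derived in
locally flat coordinates. You shouldn’t even need Professor Flat to tell you that any more.

Contract this with $g^{\tau\sigma}$. Remembering that the covariant derivative of the metric tensor
vanishes so that we can slip the metric tensor past the covariant derivative $D$, we obtain

$$D_{\nu}R_{\rho\lambda} + D^{\tau}R_{\tau\rho\lambda\nu} - D_{\lambda}R_{\rho\nu} = 0 \tag{17}$$

where we used the antisymmetry of the Riemann tensor in the last two indices. Contracting
again with $g^{\rho\lambda}$, we find $D_{\nu}R - D^{\tau}R_{\tau\nu} - D^{\rho}R_{\rho\nu} = 0$, or written more compactly,

$$D^{\mu}\left(R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R\right) = D^{\mu}E_{\mu\nu} = 0 \tag{18}$$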


This identity satisfied by the Einstein tensor (and a direct consequence of (16)) is sometimes
also referred to as the Bianchi identity, although it should, more properly, be known
as the contracted Bianchi identity.


* We will discuss the expanding universe with curved space when we explore cosmology in part VIII.


Einstein’s real blunder 


Applying the contracted Bianchi identity (18) to the field equation (10), we obtain energy 
momentum conservation 


$$D_{\mu}T^{\mu\nu} = 0 \tag{19}$$


In most applications, we deal with an energy momentum tensor known to be conserved, so 
that the Bianchi identity actually is telling us that the field equations are not independent: 
one linear combination of the derivatives of the different equations vanishes identically. 

Let us go back to the apparent miracle in chapters VI.2 and VI.3. In both, when we solved
for the metric, we had one more equation than unknowns but nevertheless we obtained
a consistent solution. Now we understand what is going on, thanks to Bianchi. We
overcounted the number of equations by one: one linear combination is satisfied automatically.
In practice, this provides us with a useful check on our arithmetic.

In his epoch-making 1915 paper on gravity, Einstein actually did not use the action
principle, but wrote down the field equation directly by arguing what the left hand side
must be. (We will come back to this in chapter IX.5.) For a number of years, he struggled
with this approach, and at one point wrote down $R^{\mu\nu} = 8\pi GT^{\mu\nu}$, which did not work,
since the Bianchi identity applied to this would give $D_{\mu}T^{\mu\nu} \neq 0$.

There is a lot of quasi-nonsense written about Einstein’s greatest blunder in introducing
the cosmological constant* which I find rather annoying.⁴ In my opinion, if the great man
had blundered at all, it was in not using the action principle (see, however, appendix 5).


Appendix 1: Bianchi identity 


The Bianchi identity can also be derived as a special case of the Jacobi identity $[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$,
which you can prove by writing out all the terms on the left hand side. Here $A$, $B$, $C$ are three
operators. (Or, you could argue that there are $2 \times 2 \times 3 = 12$ terms on the left hand side, so that each of the 6
possible terms, for example $ABC$, must appear twice, once with a positive sign (in $[A, [B, C]]$ in the example)
and once with a negative sign (in $[C, [A, B]]$ in the example).) In particular,

$$[D_{\mu}, [D_{\nu}, D_{\lambda}]] + [D_{\nu}, [D_{\lambda}, D_{\mu}]] + [D_{\lambda}, [D_{\mu}, D_{\nu}]] = 0 \tag{20}$$

Using (VI.1.5) $[D_{\mu}, D_{\nu}]S_{\rho} = -R^{\sigma}{}_{\rho\mu\nu}S_{\sigma}$ and the fact that the covariant derivative is distributive, we obtain the
Bianchi identity (16).
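For readers who like to see the Jacobi identity at work before trusting the term counting, the following Python sketch checks it on random integer matrices. The matrices merely stand in for the operators $A$, $B$, $C$ (an illustrative assumption; the covariant derivatives are of course not finite matrices), and the helper names are hypothetical. A numerical check on one sample is an illustration, not a proof.

```python
import random


def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return sub(matmul(X, Y), matmul(Y, X))


def random_matrix(n, rng):
    return [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)  # fixed seed so the check is reproducible
A, B, C = (random_matrix(3, rng) for _ in range(3))

# [A, [B, C]] + [B, [C, A]] + [C, [A, B]]
jacobi = add(add(comm(A, comm(B, C)), comm(B, comm(C, A))), comm(C, comm(A, B)))
zero = [[0] * 3 for _ in range(3)]
```

Because the entries are integers, the arithmetic is exact and `jacobi` equals `zero` identically, with no rounding tolerance needed.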

We now see that the “other half of Maxwell’s equations” $\varepsilon^{\mu\nu\lambda\sigma}\partial_{\nu}F_{\lambda\sigma} = 0$ discussed in chapter IV.2 are in fact
Bianchi identities, as we could see by setting the covariant derivatives in (20) to the quantum mechanical covariant
derivative of electromagnetism in flat spacetime $D_{\mu} = (\partial_{\mu} - iA_{\mu})$. (See an endnote in chapter VI.1 for a derivation
of the $i$.)


* Contrary to what is constantly reported, Einstein never said in print that the cosmological constant was his
greatest blunder. It was George Gamow, a jokester of record, who wrote that Einstein told him as much.


Appendix 2: Another derivation of the contracted Bianchi identity


It is illuminating to give yet another derivation of the contracted Bianchi identity. We simply modify the discussion
given in appendix 1 of chapter VI.4. There we derived energy momentum conservation using for the prototypical
matter action the Maxwell action $S_{\rm Maxwell}$. We are going to do unto the Einstein-Hilbert action $S_{\rm EH}$ here what we
did unto $S_{\rm Maxwell}$ there. It would be helpful for you to review the appendix in question now, as I am not going to
write out the steps in detail again.

We know that $S_{\rm EH}$ is invariant under the replacements $x \to x'$, $g_{\rho\sigma}(x) \to g'_{\rho\sigma}(x')$, and $g^{\rho\sigma}(x) \to g'^{\rho\sigma}(x')$. It
is instructive to compare the consequence of general invariance for $S_{\rm EH}$ and $S_{\rm Maxwell}$. There, in chapter VI.4, after
a few steps we find that the general invariance of $S_{\rm Maxwell}$ implies (VI.4.19)

$$0 = \delta S_{\rm Maxwell} = \int d^4x\,\frac{\delta S_{\rm Maxwell}}{\delta A_{\mu}(x)}\,\delta A_{\mu}(x) + \frac{1}{2}\int d^4x\sqrt{-g}\,T^{\rho\sigma}(x)\,\delta g_{\rho\sigma}(x) \tag{21}$$

Here, applying the same reasoning, we find that the general invariance of $S_{\rm EH}$ implies

$$0 = \delta S_{\rm EH} = -\frac{1}{16\pi G}\int d^4x\sqrt{-g}\left(R^{\rho\sigma} - \frac{1}{2}g^{\rho\sigma}R\right)(x)\,\delta g_{\rho\sigma}(x) = -\frac{1}{16\pi G}\int d^4x\sqrt{-g}\,E^{\rho\sigma}\,\delta g_{\rho\sigma} \tag{22}$$


Note two differences between (22) and (21). First, there is not a $\delta A$ term in (22), of course, since here we
don’t even have an $A$ field to vary. Second, while the variation of $S_{\rm Maxwell}$ with respect to the metric gives the
energy momentum tensor of the electromagnetic field, the variation of $S_{\rm EH}$ with respect to the metric gives
the Einstein tensor.

As explained in appendix 1 of chapter VI.4, in (21) we then invoke the equation of motion for $A$, use
$\delta g_{\rho\sigma} = -(\varepsilon_{\rho;\sigma} + \varepsilon_{\sigma;\rho})$, and integrate by parts, thus obtaining energy momentum conservation of the matter
fields $D_{\rho}T^{\rho\sigma} = 0$.

In (22), using the form of $\delta g_{\rho\sigma}$ and integrating by parts, we obtain the contracted Bianchi identity
$D_{\rho}E^{\rho\sigma} = 0$.

While the derivations of the two equations $D_{\rho}T^{\rho\sigma} = 0$ and $D_{\rho}E^{\rho\sigma} = 0$ appear superficially similar, we must
keep in mind an important conceptual difference. Energy momentum conservation of the matter fields follows
only if the matter fields satisfy their respective equations of motion, as makes sense physically. In contrast, the
contracted Bianchi identity is, well, an identity. This is rendered particularly clear in the derivation given here:
in one case, we have to invoke the relevant equation of motion, in the other case, not.

The contracted Bianchi identity $D_{\rho}E^{\rho\sigma} = 0$ and Einstein’s field equation $E^{\rho\sigma} = 8\pi GT^{\rho\sigma}$ imply energy
momentum conservation $D_{\rho}T^{\rho\sigma} = 0$. Conversely, the field equation and energy momentum conservation demand
the existence of an identity. As was mentioned in the text, Einstein’s ignorance of the contracted Bianchi identity
contributed to his difficulties in arriving at his field equation.


Appendix 3: The “total energy momentum tensor” vanishes 


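If we wish, we can extend the definition (VI.4.2) for the energy momentum tensor to the Einstein-Hilbert action
(1) and define $T^{\mu\nu}_{\rm gravity} = \frac{2}{\sqrt{-g}}\frac{\delta S_{\rm EH}}{\delta g_{\mu\nu}} = -\frac{1}{8\pi G}(R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R)$, so that the field equation can be written as

$$T^{\mu\nu}_{\rm gravity}(x) + T^{\mu\nu}_{\rm matter}(x) = 0 \tag{23}$$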


Some authors like to say that Einstein’s field equation tells us that the total energy momentum tensor is equal 
to zero. I do not find this formulation particularly useful.* 


* For instance, we could move $ma$ in $F = ma$ to the left hand side, define $-ma$ as the “inertial force,” and say
that the total force vanishes. Not a terribly useful way to think about the second law.


Appendix 4: More on the variation of the Ricci tensor


We give an alternative, and perhaps simpler, proof that $\delta R_{\mu\nu}$ is a total derivative. From general considerations,
this tensor has to be constructed out of $\delta g_{\cdot\cdot}$ and two powers of the covariant derivative. Let’s classify all possible
terms according to where the indices $\mu\nu$ go. There are three possibilities. Both indices are carried by $\delta g$: we have
$D^2\delta g_{\mu\nu}$. One of them is carried by $\delta g$: we have $D_{\mu}D^{\lambda}\delta g_{\lambda\nu}$ (plus of course the term obtained by exchanging
$\mu$ and $\nu$). Neither index is carried by $\delta g$: we have $D_{\mu}D_{\nu}\,g^{\lambda\kappa}\delta g_{\lambda\kappa}$ (recall that $D_{\rho}g^{\lambda\kappa} = 0$, and so it does not
matter whether the inverse metric is inside or outside the covariant derivatives). Each of these terms is a total
derivative.


Appendix 5: Palatini (actually Einstein) formalism 


Here we write the action for Einstein gravity in the Palatini formalism

$$S = \frac{1}{16\pi G}\int d^4x\sqrt{-g}\,g^{\mu\nu}R_{\mu\nu}(\partial\Gamma, \Gamma) + S_{\rm matter} \tag{24}$$


When shown this for the first time, your immediate reaction might be, “What? Isn’t this the same thing as 
what we had in the text?” Indeed, your indignation is justified, but I have not yet specified for you the dynamical 
variables, which you must insist on knowing immediately when you are shown an action. 

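Attilio Palatini chose to regard the metric $g_{\mu\nu}$ and the Christoffel symbol $\Gamma^{\lambda}_{\mu\nu}$ as independent dynamical
variables. You say, “How could that be? Isn’t the Christoffel symbol defined in terms of the metric?” Yes, you are
totally right, but only in the standard formalism. You are now invited to contemplate the action in (24), in which
the symbol $R_{\mu\nu}(\partial\Gamma, \Gamma)$ is to be regarded as shorthand for

$$R_{\mu\nu}(\partial\Gamma, \Gamma) = (\partial_{\lambda}\Gamma^{\lambda}_{\mu\nu} + \Gamma^{\lambda}_{\lambda\kappa}\Gamma^{\kappa}_{\mu\nu}) - (\partial_{\nu}\Gamma^{\lambda}_{\mu\lambda} + \Gamma^{\lambda}_{\nu\kappa}\Gamma^{\kappa}_{\mu\lambda}) \tag{25}$$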


with $\Gamma$ some unknown object carrying 3 indices. Note that the first term in (24) now involves only one power of
derivative; thus, the Palatini formalism is also known as the first order formalism for gravity.

Ask not what you can do for the Palatini formalism; ask what the Palatini formalism can do for you.
What it can do is to render the variation of $S$ with respect to $g_{\mu\nu}$ “trivially” easy, since we don’t have to vary
$R_{\mu\nu}(\partial\Gamma, \Gamma)$: it doesn’t contain $g_{\mu\nu}$. We almost “instantly” obtain Einstein’s field equation
$R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R = 8\pi GT^{\mu\nu}$.

But not so fast! We haven’t quite gotten Einstein’s field equation yet, since at this point, $R_{\mu\nu}$ was just a symbol
for the mess in (25) involving the unknown object $\Gamma$. We get Einstein’s field equation only after we determine $\Gamma$
in terms of the metric.

How do we do that? Remember that we are treating $\Gamma$ as an independent dynamical variable. Euler and
Lagrange tell us that we are obliged to also vary $S$ with respect to $\Gamma$. Since $S_{\rm matter}$ does not depend on $\Gamma$, we only
have to vary $g^{\mu\nu}R_{\mu\nu}(\partial\Gamma, \Gamma)$ in the first term in $S$.

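To avoid drowning in a sea of indices, let’s do it schematically, living what I call the unindexed life:

$$\delta\left(\sqrt{-g}\,g^{\cdot\cdot}(\partial_{\cdot}\Gamma^{\cdot}_{\cdot\cdot} + \Gamma^{\cdot}_{\cdot\cdot}\Gamma^{\cdot}_{\cdot\cdot})\right) \sim \sqrt{-g}\,g^{\cdot\cdot}(\partial_{\cdot}\delta\Gamma^{\cdot}_{\cdot\cdot} + \delta\Gamma^{\cdot}_{\cdot\cdot}\Gamma^{\cdot}_{\cdot\cdot} + \Gamma^{\cdot}_{\cdot\cdot}\delta\Gamma^{\cdot}_{\cdot\cdot}) \sim (\partial_{\cdot}g^{\cdot\cdot} + g^{\cdot\cdot}\Gamma^{\cdot}_{\cdot\cdot})\,\delta\Gamma^{\cdot}_{\cdot\cdot} \tag{26}$$

where in the last step, we integrated by parts, which we are effectively allowed to do since this variation is to be
performed inside an integral. Thus, upon setting the coefficient of $\delta\Gamma^{\cdot}_{\cdot\cdot}$ to zero, we end up with an equation of
the schematic* form $\Gamma \sim g\partial g$.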

Three guesses on what the equation turns out to be when you keep track of the indices carefully. If you made
it this far in this book, surely you guessed $\Gamma^{\lambda}_{\mu\nu} = \frac{1}{2}g^{\lambda\rho}(\partial_{\mu}g_{\nu\rho} + \partial_{\nu}g_{\mu\rho} - \partial_{\rho}g_{\mu\nu})$. What else could it be? Consider
this proved by the “what else could it be” method. You are of course urged to put all the indices back and check
this statement by direct computation.


* Ha! We are now so schematic that we even drop the dots.


Appendix 6: “The wretchedness of humanity” 


In this book, I emphasize the action. In contrast, most textbooks I know approach the subject by trying to find an
equation of motion that reduces to Newton’s $\nabla^2\Phi = 4\pi G\rho$. I prefer the action approach, because contemporary
theoretical physics at the fundamental level deals mostly with actions and Lagrangians, not equations of motion.*
The equation of motion approach goes something like the following. We know that $\Phi$ measures the deviation of
$g_{00}$ from its Minkowski value, so let’s look for something with two derivatives acting on $g_{\mu\nu}$ that would reduce to
$\nabla^2\Phi$. We then argue that this something must be $R^{\mu\nu} + \beta g^{\mu\nu}R$, with some unknown coefficient $\beta$. But for this
to be equal to $T^{\mu\nu}$, we must have energy momentum conservation and thus $D_{\mu}(R^{\mu\nu} + \beta g^{\mu\nu}R) = 0$, which then
fixes $\beta$ after some calculation. Arriving at Einstein’s field equation (10) this way is of course entirely equivalent
to the action approach. Each to his or her own taste, but if you are to move on to field theory and string theory,
you better get used to where the action is.

Ever since I was a student, I wondered why Einstein, who was surely familiar with the action principle, did not 
follow the action principle, which would not have demanded that he knew the Bianchi identity and thus would 
have significantly lessened his struggle. The answer is that, in fact, he did! 

So a bit of history⁵ I learned while writing this text. Einstein was smarter than the textbooks that follow the
equation of motion approach make him out to be. He and his friend Marcel Grossmann published a paper in
1914 about a variational principle for gravity and then wrote to Lorentz about it. Stimulated by this letter, Lorentz
published a paper in 1915 varying a Lagrangian $\mathcal{L}(g, \partial g)$ without specifying what $\mathcal{L}$ was. Then, in a paper presented
to the Royal Prussian Academy of Sciences on November 4, 1915, Einstein obtained a set of field equations using
the action principle, but with an $\mathcal{L}$ that is not a scalar! Furthermore, he imposed the condition $\det(g_{\mu\nu}) = -1$.
Three weeks later, on November 25, 1915, Einstein presented to the same academy his field equations, but without
using the action principle.

But Einstein was scooped! On November 20, David Hilbert presented to the Göttingen Academy the gravitational
field equations he derived by varying an action. This action, as you know, is now called the Einstein-Hilbert
action. Quite rightly in the opinion of all physicists, Einstein is credited with this action, even though strictly
speaking, he found the equations of motion that emerge from the action rather than the action itself. The theoretical
physics community is not a court of law: it regards Hilbert, although he did find the action first, as playing
second fiddle to Einstein.

Incidentally, historians of physics have come to a “belated decision in the Hilbert-Einstein priority dispute.”⁶

But at that time, Einstein didn’t know that history would be kind to him in this one respect. He was justifiably
worried and, perhaps less justifiably, angry. In fact, he was sufficiently incensed as to dash off a letter on November 26
to a friend. In the letter, the great man also bitterly denounced his estranged wife for her influence on
their children,⁷ but before launching into a diatribe about his personal life, he first accused Hilbert of stealing
his theory.

Einstein wrote, “The theory is of incomparable beauty. But only one colleague has really understood it, and he
is trying, rather skillfully, to ‘nostrify’ it. That’s Max Abraham’s coinage. In my personal experience, I’ve hardly
come to know the wretchedness of humanity better than in connection with this theory.”⁸ Well, dear reader,
nostrification is not only still practiced in theoretical physics, but ever more skillfully.

One could fantasize that had Einstein mastered the action approach and Riemann’s work, his travails over 
the 10 years from special to general relativity could have been replaced by an inspired guess. Hilbert had a 
tremendous advantage over the befuddled Einstein struggling to learn differential geometry: he was a leading 
mathematician who obviously already knew the subject forward and backward. Moreover, he had worked on 
the theory of invariants, a branch of mathematics concerned with the question of what is left unchanged by 
a given set of transformations. He knew that the scalar curvature does not change under general coordinate 
transformations. Thus, once the question was posed properly, Hilbert knew instantly that the sought-for action 
governing spacetime must be the scalar curvature. 


* Just flip through any textbook on quantum field theory.

Here is a coda to the story. Now that Lorentz knew what $\mathcal{L}$ was, he studied* the action principle for gravity in
a series of papers, using a more general $S_{\rm matter}$ than Hilbert did. As for Einstein, a year later, on November 26,
1916, he presented a paper titled “Hamilton’s Principle and the General Theory of Relativity” in which he wrote
pointedly, “We shall make as few specializing assumptions as possible, in marked contrast to Hilbert’s treatment.”

Now I come back to the Palatini formalism. It is nowhere to be found in the paper Palatini presented to
the Circolo Matematico di Palermo on August 10, 1919! He improved Hilbert’s calculation and in fact had the
Palatini identity introduced in this chapter. What the textbooks now call the Palatini formalism was actually
invented in 1925 by Einstein! As the years passed, apparently people mixed up the Palatini identity and the
Palatini formalism, and various people, including Einstein,† referred to (24) as the Palatini formalism. After all,
both the Palatini identity and the Palatini formalism absolve us of having to vary the Ricci tensor, so it is easy to
mix up the two. This confusion has been perpetuated unwittingly in many textbooks (and wittingly in this one).


Appendix 7: An alternative form of the Einstein-Hilbert action 


We can rewrite the Einstein-Hilbert action in an alternative form that I do not like for reasons that will become 
clear but that I mention here for the sake of completeness. Looking up the expression for the Ricci tensor given 
in (25), we write the action in (1) (suppressing the irrelevant overall constant) as 


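$$S = \int d^4x\sqrt{-g}\,g^{\mu\nu}R_{\mu\nu} = \int d^4x\sqrt{-g}\,g^{\mu\nu}\left(\partial_{\lambda}\Gamma^{\lambda}_{\mu\nu} - \partial_{\nu}\Gamma^{\lambda}_{\mu\lambda} + \Gamma^{\lambda}_{\lambda\kappa}\Gamma^{\kappa}_{\mu\nu} - \Gamma^{\lambda}_{\nu\kappa}\Gamma^{\kappa}_{\mu\lambda}\right) = S_{\rm I} + S_{\rm II} + S_{\rm III} + S_{\rm IV} \tag{27}$$

We next integrate by parts to get rid of the derivatives on the Christoffel symbols, assuming that all surface terms
can be thrown away. First we have

$$S_{\rm I} = \int d^4x\sqrt{-g}\,g^{\mu\nu}\partial_{\lambda}\Gamma^{\lambda}_{\mu\nu} = -\int d^4x\,\partial_{\lambda}(\sqrt{-g}\,g^{\mu\nu})\,\Gamma^{\lambda}_{\mu\nu}$$

Differentiating, we write $\partial_{\lambda}(\sqrt{-g}\,g^{\mu\nu}) = \sqrt{-g}\,(\Gamma^{\rho}_{\rho\lambda}g^{\mu\nu} + \partial_{\lambda}g^{\mu\nu})$. Invoking the identity
$D_{\lambda}g^{\mu\nu} = 0 = \partial_{\lambda}g^{\mu\nu} + \Gamma^{\mu}_{\lambda\rho}g^{\rho\nu} + \Gamma^{\nu}_{\lambda\rho}g^{\mu\rho}$, we can then write
$S_{\rm I} = -\int d^4x\sqrt{-g}\,(\Gamma^{\rho}_{\rho\lambda}g^{\mu\nu} - \Gamma^{\mu}_{\lambda\rho}g^{\rho\nu} - \Gamma^{\nu}_{\lambda\rho}g^{\mu\rho})\,\Gamma^{\lambda}_{\mu\nu}$. Similarly, we obtain

$$S_{\rm II} = -\int d^4x\sqrt{-g}\,g^{\mu\nu}\partial_{\nu}\Gamma^{\lambda}_{\mu\lambda} = \int d^4x\,\partial_{\nu}(\sqrt{-g}\,g^{\mu\nu})\,\Gamma^{\lambda}_{\mu\lambda} = \int d^4x\sqrt{-g}\,(\Gamma^{\rho}_{\rho\nu}g^{\mu\nu} - \Gamma^{\mu}_{\nu\rho}g^{\rho\nu} - \Gamma^{\nu}_{\nu\rho}g^{\mu\rho})\,\Gamma^{\lambda}_{\mu\lambda}$$

Adding, we find $S_{\rm I} + S_{\rm II} = 2\int d^4x\sqrt{-g}\,g^{\mu\nu}(\Gamma^{\lambda}_{\nu\kappa}\Gamma^{\kappa}_{\mu\lambda} - \Gamma^{\lambda}_{\lambda\kappa}\Gamma^{\kappa}_{\mu\nu})$. We next observe that $S_{\rm III} + S_{\rm IV}$, as written
in (27) with no further massaging needed, is just, interestingly, $-\frac{1}{2}$ times this expression. Thus, we end up with
an alternative form for the Einstein-Hilbert action

$$S = \int d^4x\sqrt{-g}\,g^{\mu\nu}\left(\Gamma^{\lambda}_{\nu\kappa}\Gamma^{\kappa}_{\mu\lambda} - \Gamma^{\lambda}_{\lambda\kappa}\Gamma^{\kappa}_{\mu\nu}\right) \tag{28}$$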


The reason that I don’t like this form of the action is now clear: the integrand is not a scalar under coordinate 
transformation, as is the case for the Einstein-Hilbert action. Throwing boundary terms away at will has done 
violence to the underlying invariance properties of the action. Nevertheless, this form of the action is useful in 
some situations. 


* Lorentz mentioned that the Belgian Théophile Ernest de Donder (1872-1957) also contributed. History
has not been kind to de Donder. (See, however, the footnote on p. 21 of QFT Nut.) The harmonic gauge to be
introduced in chapter IX.4 is also known as the de Donder gauge. Perhaps you have seen the famous photograph
of the post-quantum mechanics 1927 Solvay Conference? De Donder stood behind Dirac, who in turn was seated
behind Lorentz and Einstein.

† By that time, Einstein, assured of his place in history, could afford to be generous to a minor figure like
Palatini.


Appendix 8: Diffeomorphism


Our good friend the Jargon Guy has been lobbying us to use the word diffeomorphism. “How can you write a
book on general relativity without saying diffeomorphism?” Fine. We will say it,⁹ here being as good a place
as anywhere. A map $f$ of a manifold $M$ onto itself that is smooth, differentiable, and invertible is called a
diffeomorphism. (Think of a sphere $S^2$ being mapped onto itself.) Let a point $P$ on the manifold be mapped
to a point $Q = f(P)$. Suppose that associated with each point $P$ is a number $T(P)$. In other words, the function
$T$ is defined on the manifold. Suppose also that the diffeomorphism moves the number $T(P)$ to the point $Q$.
We define a new function $\tilde T$ by

$$\tilde T(Q) = T(P) = T(f^{-1}(Q)) \tag{29}$$

What thrills our friend the Jargon Guy is that all this could be done without “dirtying our hands” with
coordinates. Physically, we can imagine an incompressible fluid flowing on the sphere $S^2$. A fluid element that
is at the point $P$ now will arrive at the point $Q$ after some time $\Delta t$. The number $T(P)$ could then represent a
physical property associated with the fluid element at $P$ on the manifold. (For example, we could think of $T$ as
temperature. Assuming that there is no heat transfer between neighboring fluid elements, $\tilde T(Q)$ would then be
the temperature at $Q$ at a time $\Delta t$ from now.) The diffeomorphism discussed here is also known as an active
diffeomorphism.

Descending to physics, we are now crass enough to introduce coordinates to cover local patches on the
manifold, as we have done since early on in this book. To please our friend, we put on our mathematical hats
and regard the coordinates $x^{\mu}$ as an invertible smooth map $x : M \to R^d$, associating $d$ real numbers $x^{\mu}(P)$ with
each point $P$ on the $d$-dimensional manifold $M$. We can now define $\hat T(x) = T(P(x))$, with $P(x)$ denoting the
inverse map $x^{-1} : R^d \to M$; in other words, $\hat T(x)$ is the composition of this map with the map $T : M \to R$.
Most physicists would probably drop the hat on $\hat T$ and simply write $T(x)$, but conceptually, $\hat T$ and $T$ are entirely
different creatures, and we will maintain the distinction for now to keep our friend happy.

Let us change coordinates $x \to x' = F(x)$ so that as usual


$$\hat T'(x') = \hat T(x) = \hat T(F^{-1}(x')) \tag{30}$$


The rigor minded refer to coordinate transformation as a passive diffeomorphism to distinguish it from an active 
diffeomorphism. 

Now comes the key point: (29) and (30) are structurally the same with f “corresponding” to F. Hence, an 
action that does not distinguish between different coordinate choices is invariant under active diffeomorphisms. 
But that has been precisely what we’ve been doing all along, but not stating explicitly. The whole point of this 
appendix* is to formalize the obvious (and to show off our knowledge of the word “diffeomorphism” to our 
friend). 
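The structural parallel between (29) and (30) can be made concrete with a toy example in Python. Everything here is an illustrative assumption rather than anything in the text: the "manifold" is taken to be the real line, the coordinate map is the identity, and the particular functions `f`, `F`, and `T` are arbitrary choices.

```python
# Toy model (all choices are illustrative assumptions): the "manifold" is the real
# line, a point P is labeled by a float, and T assigns a number to each point.

def T(P):
    return P * P


# Active diffeomorphism f and its inverse: slide every point by 0.5.
def f(P):
    return P + 0.5


def f_inv(Q):
    return Q - 0.5


def T_tilde(Q):
    """New function defined as in (29): T_tilde(Q) = T(f^{-1}(Q))."""
    return T(f_inv(Q))


# Passive version: coordinates x(P) = P, and a change of coordinates x' = F(x).
def F(x):
    return 2.0 * x + 1.0


def F_inv(x_prime):
    return (x_prime - 1.0) / 2.0


def T_hat(x):
    """T expressed in the coordinate x (here the coordinate map is the identity)."""
    return T(x)


def T_hat_prime(x_prime):
    """The same function in the new coordinates, as in (30)."""
    return T_hat(F_inv(x_prime))


P = 1.5
active_ok = T_tilde(f(P)) == T(P)           # the value T(P) has been carried to Q = f(P)
passive_ok = T_hat_prime(F(P)) == T_hat(P)  # same value, new label for the same point
```

In both cases the new function is the old one composed with an inverse map, which is all that "structurally the same" means here; the only difference is whether the map moves points or relabels them.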


Exercises 


1 Check that varying the Palatini action with respect to $\Gamma^{\lambda}_{\mu\nu}$ gives the usual relation between the Christoffel
symbol and the metric.


2 Show that the Einstein tensor $E_{\mu\nu}$ vanishes identically in 2-dimensional spacetime.


* After I wrote this appendix during final revision of this book, I sent it to a colleague, one of those distinguished
physicists listed in the preface. He sent back a terse email, complaining that the inclusion of diffeomorphism is
“like a scratch on a record playing the sublime music of gravity.” Then he told me, if I must include this kind of
stuff, to find a better place to hide it.


Notes


1. F. Dyson, in New York Review of Books.
2. The notation $G_{\mu\nu}$ is commonly used. In later chapters, we will use $G_{\mu\nu}$ to denote the metric in higher
dimensional spacetime.
3. We are assuming that spacetime does not have a boundary, so that we can ignore this surface term. If
spacetime has a boundary, then we can add to the action an additional term defined on the boundary, known
as the Gibbons-Hawking-York boundary term, whose variation with respect to $g_{\mu\nu}$ is designed to cancel the
surface term we encounter here.
4. I would like to see discussions of whether the cosmological constant represents Einstein’s greatest blunder 
banished forever from any sensible discourse on gravity. There is little hope of that. 
5. M. Ferraris, M. Francaviglia, and C. Reina, “Variational Formulation of General Relativity from 1915 to 1925
‘Palatini’s Method’ Discovered by Einstein in 1925,” Gen. Rel. Grav. 14 (1982), pp. 243-254. Since the authors
are all Italians, I would surmise that their curt dismissal of Palatini was not driven by nationalistic pride.
6. From the abstract of the paper “Belated Decision in the Hilbert-Einstein Priority Dispute” by L. Corry, J. Renn,
and J. Stachel, Science 278 (1997), p. 1270: “A close analysis of archival material reveals that Hilbert did not
anticipate Einstein.” Also, Hilbert apparently did not know how to get the $\frac{1}{2}$ in (10).
7. Einstein wrote:
My son [the 11-year-old Hans Albert] still hasn’t answered my inquiry about meeting. . . . That is surely
the influence of the woman. . . . You’ll see more and more, on which side goodwill and honesty are
to be found. There are reasons that I couldn’t abide staying with that woman, despite the tender love
that binds me to my children. When we first separated, the thought of my children stabbed me like a
dagger every morning when I woke up. Nonetheless, I never regret having taken the step.
Quoted in Physics Today, October 2005, p. 18.
8. Quoted in A. Fölsing, Albert Einstein: A Biography, Viking, 1997.
9. I follow the discussion in Quantum Gravity by C. Rovelli, Cambridge University Press, 2004. 


Initial Value Problems and Numerical Relativity 


The initial value or Cauchy problem 


The nonlinearity of Einstein’s field equation renders exact solutions rather unlikely, except
in situations endowed with a high degree of symmetry. The advent of computers has thus
resulted in the booming field of numerical relativity.¹ It is definitely beyond the scope of
this textbook to go into details of numerical relativity, with its highly sophisticated methods
and approaches. Rather, the purpose here is to acquaint the reader with the formulation
of the initial value problem (also known as the Cauchy problem) in Einstein gravity.²

We all know how the initial value problem works in Newtonian mechanics. If we know
the position $q$ of a particle and its velocity $\dot q$ at some time $t_0$, Newton’s law allows us to
determine $\ddot q$ at that time. We thus know the position $q$ of the particle and its velocity
$\dot q$ at time $t_0 + \delta t$ for $\delta t$ an infinitesimal. We then repeat this procedure, using a process
known as integration.

The initial value problem is particularly well suited to numerical work. Given the initial
data, namely the position $q$ of the particle and its velocity $\dot q$ at an initial time $t_0$, the
computer has to be instructed on how to generate the corresponding data at a later time
$t_0 + \delta t$, with $\delta t$ small but finite. Once instructed, the computer can then blast away and
calculate the particle’s position and velocity at some later time $t_0 + T$. It is of course a
science (and an art) to determine the optimal choice of $\delta t$ for a given $T$ and to make sure
that the roundoff errors at each step do not accumulate out of control.
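
As a minimal sketch of the stepping procedure just described, here is one way to instruct the computer, assuming a velocity Verlet update rule (my choice for illustration; the text does not single out any particular scheme). The function names `evolve` and `spring`, and the unit-mass harmonic oscillator used as a test problem, are hypothetical.

```python
import math

def evolve(q, q_dot, accel, dt, n_steps):
    """Step (q, q_dot) forward n_steps times with the velocity Verlet rule."""
    a = accel(q)                           # Newton's law gives q_ddot from q
    for _ in range(n_steps):
        q_dot_half = q_dot + 0.5 * dt * a  # half-step update of the velocity
        q = q + dt * q_dot_half            # full-step update of the position
        a = accel(q)                       # Newton's law again, at the new position
        q_dot = q_dot_half + 0.5 * dt * a  # complete the velocity update
    return q, q_dot

def spring(q):
    """Hypothetical force law: unit mass on a unit spring, q_ddot = -q."""
    return -q

T = 10.0
n_steps = 10_000
q_T, q_dot_T = evolve(1.0, 0.0, spring, T / n_steps, n_steps)
print(q_T, math.cos(T))  # numerical versus exact position at t_0 + T
```

Halving the step size and checking that the answer barely moves is a cheap way to probe whether $\delta t$ is small enough for a given $T$.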

This basic scheme of evolving the initial data in time can be immediately generalized,
first to the case of many particles (with the initial data now consisting of the position $q_a$ and
the velocity $\dot q_a$ of the $N$ particles, $a = 1, \cdots, N$) and then to fields. Consider, for example,
the scalar wave equation (described in chapters II.3 and III.6) $(\partial_\mu\partial^\mu - m^2)\phi(x) = 0$, which
we now write as $\partial_0^2\phi(t, \vec x) = (\vec\nabla^2 - m^2)\phi(t, \vec x)$. Note that the conceptual jump from many
particles to field involves promoting³ the discrete index $a$ to the continuous spatial variable
$\vec x$ (and trivially changing notation $q \to \phi$).

The initial data now consist of the two functions $\phi(0, \vec x)$ and $\partial_0\phi(0, \vec x)$. For ease of
writing, we set $t_0 = 0$ now and henceforth. The scalar wave equation then allows us to evolve
these data in time. Knowing $\partial_0\phi(0, \vec x)$, we can determine $\phi(\delta t, \vec x)$. The wave equation
then gives $\partial_0^2\phi(0, \vec x)$, which then allows us to determine $\partial_0\phi(\delta t, \vec x)$. The key here, as for
Newtonian mechanics, is that the equation of motion is second order, that is, it contains
$\partial_0^2 = \partial^2/\partial t^2$. Indeed, Newton's deep insight was that dynamics involve the second derivative,
not the first.
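
The same stepping logic carries over to the field. What follows is only a sketch, assuming one spatial dimension, a periodic grid, and a leapfrog update (all of these are simplifying assumptions of mine, not something the text prescribes). The function name `evolve_field` and the single Fourier mode used as initial data are hypothetical.

```python
import math

def evolve_field(phi, pi, m, dx, dt, n_steps):
    """Evolve d^2 phi/dt^2 = (d^2/dx^2 - m^2) phi on a periodic 1D grid.

    phi holds the field at t = 0 and pi holds its first time derivative,
    that is, exactly the two functions that make up the initial data.
    """
    n = len(phi)

    def rhs(f):
        # Centered second difference for the Laplacian; f[i - 1] wraps around at i = 0.
        return [(f[(i + 1) % n] - 2.0 * f[i] + f[i - 1]) / dx**2 - m**2 * f[i]
                for i in range(n)]

    acc = rhs(phi)  # the wave equation supplies the second time derivative
    for _ in range(n_steps):
        pi_half = [p + 0.5 * dt * a for p, a in zip(pi, acc)]
        phi = [f + dt * p for f, p in zip(phi, pi_half)]
        acc = rhs(phi)
        pi = [p + 0.5 * dt * a for p, a in zip(pi_half, acc)]
    return phi, pi

# Hypothetical initial data: a standing wave cos(k x) at rest on a circle of length 2 pi.
n = 64
dx = 2.0 * math.pi / n
k, m = 1.0, 1.0
phi0 = [math.cos(k * i * dx) for i in range(n)]
pi0 = [0.0] * n
t_final, n_steps = 1.0, 1000
phi_T, pi_T = evolve_field(phi0, pi0, m, dx, t_final / n_steps, n_steps)

# Continuum solution for comparison: cos(k x) cos(omega t) with omega^2 = k^2 + m^2.
omega = math.sqrt(k**2 + m**2)
max_err = max(abs(phi_T[i] - math.cos(k * i * dx) * math.cos(omega * t_final))
              for i in range(n))
print(max_err)
```

The residual discrepancy comes mostly from the finite grid spacing, which slightly shifts the frequency of the lattice mode relative to the continuum value.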


Gauge freedom and initial value in Maxwell electromagnetism 


All this is straightforward and elementary; the subtlety first arises in gauge theory. Consider 
Maxwell electromagnetism. We are to solve (IV.2.13) 


$\partial_\mu F^{\mu\nu} = -J^\nu$ (1)


for the vector potential $A_\mu(x)$. (Since $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, the other “half” of Maxwell's
equations $\epsilon^{\mu\nu\lambda\rho}\partial_\nu F_{\lambda\rho} = 0$ is identically satisfied.) Everything would appear to be the same
as before. Given the initial data $A_\mu(0, \vec x)$ and $\partial_0 A_\mu(0, \vec x)$, plus $J^\nu(0, \vec x)$, we can then use (1),
which are again (apparently) second order differential equations in time, that is, equations
containing $\partial_0^2$, to evolve the initial data. (Of course, we also have to include the equations
that tell the charges how to move, that is, how $J^\nu(0, \vec x)$ changes with time, but that half of
the story is not the focus of our discussion here.) It would seem that the four equations in
(1) determine the four functions $A_\mu(x)$.

Not so fast! This is a gauge theory, and hence $A_\mu(x)$ and $A'_\mu(x) = A_\mu(x) + \partial_\mu\Lambda(x)$
correspond to the same physics. In other words, the subsequent value of $A_\mu(x)$ should
be determined only up to an arbitrary function $\Lambda(x)$ (taken to vanish smoothly together
with its first time derivative $\partial_0\Lambda(x)$ as $x^0 \to 0$, so that $A'_\mu$ and $A_\mu$ have the same initial
data).

The resolution of this apparent paradox is simply that the $\nu = 0$ equation in (1) does
not involve $\partial_0^2$ and is thus not a time evolution equation. We can see this explicitly:
$-J^0 = \partial_\mu F^{\mu 0} = \partial_i F^{i0} = \partial_i(\partial^i A^0 - \partial^0 A^i)$, and hence this equation contains only 1 power
of $\partial_0$.

It would be good to express this more physically. If we write out this equation in more
elementary notation, we recognize it as simply Gauss’s law $\vec\nabla \cdot \vec E = \rho$. In the initial data,
once we write down some initial charge distribution $\rho(0, \vec x)$, we are not allowed to write
down any old $A_\mu(0, \vec x)$ and $\partial_0 A_\mu(0, \vec x)$; these eight functions of $\vec x$ must lead to an electric
field $\vec E(0, \vec x)$ satisfying Gauss’s law. This makes physical sense.

We conclude that of the 4 equations in (1), one merely imposes a constraint on the initial
data $A_\mu(0, \vec x)$ and $\partial_0 A_\mu(0, \vec x)$. There are only three time evolution equations, which do not
determine the 4 unknown functions $A_\mu(t, \vec x)$ completely. But that is exactly right: $A_\mu(t, \vec x)$
should be determined only up to the gauge function $\Lambda(t, \vec x)$.

Let us give another demonstration that $\partial_\mu F^{\mu 0}$ contains only one power of $\partial_0$. The proof
will be by contradiction. Consider the trivial identity $\partial_\nu\partial_\mu F^{\mu\nu} = 0$, which follows from the
antisymmetry of $F^{\mu\nu}$ and the fact that $\partial_\nu$ and $\partial_\mu$ commute. Write out this identity* as
$\partial_0(\partial_\mu F^{\mu 0}) = -\partial_i\partial_\mu F^{\mu i}$. Suppose $\partial_\mu F^{\mu 0}$ contains 2 powers of $\partial_0$. Then the left hand side
of this identity would contain three powers of $\partial_0$, but the right hand side manifestly has
no room for three powers of $\partial_0$. Thus, we have proved what we set out to prove.

This proof using a trivial identity might seem to be overkill, given that the property we 
want to prove, that $\partial_\mu F^{\mu 0}$ does not contain 2 powers of $\partial_0$, is something we can see by
eyeball, as indicated above. However, this line of attack will turn out to be useful in the 
context of Einstein gravity. 

If Gauss’s law holds at the initial time, we expect it to continue to hold as the charges
rush madly about with the electric field changing accordingly. To verify this, simply
differentiate the quantity $\vec\nabla \cdot \vec E - \rho$ with respect to time: $\partial_0(\vec\nabla \cdot \vec E - \rho) = \vec\nabla \cdot \partial_0\vec E - \partial_0\rho =
\vec\nabla \cdot (\vec\nabla \times \vec B - \vec J) + \vec\nabla \cdot \vec J = 0$, where we used a Maxwell equation and current conservation.
Thus, if the quantity $\vec\nabla \cdot \vec E - \rho$ vanishes at the initial time, it will continue to vanish at later
times.

This physical check also leads to the trivial identity we used earlier. Write the quantity we
just calculated as $\partial_0(\partial_i F^{i0} + J^0)$, and you see that its relativistic completion is $\partial_\nu(\partial_\mu F^{\mu\nu} +
J^\nu)$, which with $\partial_\nu J^\nu = 0$ is just $\partial_\nu\partial_\mu F^{\mu\nu}$.


Gauge freedom and initial value in Einstein gravity 


We are now warmed up sufficiently to tackle Einstein’s field equations 


$E^{\mu\nu} \equiv R^{\mu\nu} - \frac{1}{2} g^{\mu\nu} R = T^{\mu\nu}$ (2)

(where we have dropped the $16\pi G$ for convenience). We expect that the initial data on the
$t = 0$ slice of spacetime consist of $g_{\mu\nu}(0, \vec x)$ and $\partial_0 g_{\mu\nu}(0, \vec x)$. (In general, the initial data
will be specified on some spacelike hypersurface, known as the Cauchy surface; we simply
assume that we can choose the $t$ coordinate so that the hypersurface is described by $t = 0$,
at least locally.) Since the field equations contain two powers of spacetime derivative, as
we have long known, we would think that, given the metric and how it is changing with
time at $t = 0$, the field equations will give us $\partial_0^2 g_{\mu\nu}(0, \vec x)$. If so, then we could time evolve
the metric as in examples of Newtonian mechanics and the scalar wave equation. The 10
equations in (2) would then determine the 10 unknown functions $g_{\mu\nu}(x)$.

But we are forewarned by the example of Maxwell’s equations that things may not be 
so simple. Indeed, the invariance of physics under general coordinate transformation 
$x^\mu \to x'^\mu(x)$ allows us to eliminate 4 of the 10 unknown functions $g_{\mu\nu}(x)$. In other words,
Einstein’s field equations should not determine $g_{\mu\nu}(x)$ completely. Our experience with
the Maxwell case suggests that 4 of the 10 equations in (2) are not time evolution equations 
but merely constraints on the initial data. 


* Note that this identity together with Maxwell’s equation implies that the current must be conserved:
$\partial_\nu(\partial_\mu F^{\mu\nu} + J^\nu) = 0$ gives $\partial_\nu J^\nu = 0$.

Now that I have explained to you how things work in the Maxwell case, it should be fairly
clear to you how things work in the Einstein case. But historically, the great Einstein was 
confused at this point. He concluded that he was faced with a logical choice: either (a) the 
field equations are not deterministic, or (b) invariance of physics under general coordinate 
transformation is too strong a requirement. Unfortunately, in 1914, he chose to abandon 
invariance of physics under general coordinate transformation. Ouch, Albert, the wrong 
choice! Fortunately, in 1915, with Hilbert breathing down his neck, Einstein’s brainpower 
kicked into high gear, and he realized that the correct choice was (a). The field equations 
do not, and should not, determine $g_{\mu\nu}(x)$ completely.

So, give me your best guess before reading on: which four of the equations in (2) are 
not time evolution equations, that is, do not contain $\partial_0^2$ acting on the metric?

Right, the claim is that $E^{0\nu} = T^{0\nu}$ amounts to constraints on $g_{\mu\nu}(0, \vec x)$ and $\partial_0 g_{\mu\nu}(0, \vec x)$.
Of course, you could verify this claim by working out $E^{0\nu}$ directly, but things here are
not as simple as seeing by eyeball that $\partial_\mu F^{\mu 0}$ does not contain $\partial_0^2$. As suggested by
the Maxwell example, we need the analog of the identity $\partial_\nu\partial_\mu F^{\mu\nu} = 0$, namely the contracted
Bianchi identity $D_\mu E^{\mu\nu} = 0$. Let us write this out in longhand: $\partial_0 E^{0\nu} = -\partial_i E^{i\nu} +$
terms involving the Christoffel symbols $\Gamma$ times various $E$s.

Again, the proof is by contradiction. Suppose that $E^{0\nu}$ contains $\partial_0^2$. Then the left hand
side of the preceding equation contains $\partial_0^3$. But nowhere on the right hand side can we
find $\partial_0^3$. (For example, the Christoffel symbols contain $\partial_0$ and the $E$s contain two powers of
$\partial_0$.) Contradiction and QED. In other words, the two powers of $\partial_0$ contained in $E^{0\nu}$ must
appear in the form $\partial_0 g_{\cdot\cdot}\,\partial_0 g_{\cdot\cdot}$. The four equations $E^{0\nu} = T^{0\nu}$ merely constrain the initial
data. (We are implicitly assuming that $T^{0\nu}$ does not contain $\partial_0^2$, as is true of the standard⁴
energy momentum tensor we normally encounter.)

Confusio appears, well, confused. “Why must the right hand side also contain $\partial_0^3$ if the
left hand side contains $\partial_0^3$? Newton’s equation $F = ma$ has $\partial_0^2$ on one side but not the
other.” You explain to Confusio patiently, “We must distinguish between equations and
identities.⁵ The equations of motion are satisfied only by the actual solutions, while the
contracted Bianchi identity must be satisfied for any $g_{\mu\nu}$.”

Again, this makes physical sense: after writing down some initial $T^{0\nu}(0, \vec x)$, you can’t
write down any old $g_{\mu\nu}(0, \vec x)$. You have to figure out how $g_{\mu\nu}(0, \vec x)$ and $\partial_0 g_{\mu\nu}(0, \vec x)$ are
constrained by $T^{0\nu}(0, \vec x)$. For Maxwell electromagnetism, you have a similar problem,
but having written down the charge distribution, you simply sum up the electric field due to
each charge. Here the constraint equations $E^{0\nu} = T^{0\nu}$ are highly nonlinear, and in general
would require numerical methods to solve. Thus, before you solve the initial value problem 
numerically, you may already have a nontrivial numerical problem to solve just to set up the 
numerical problem! Before we start our lives as adults, we already have to solve problems 
almost as hard as the problems we have to solve as adults. 

By the way, as you recall from chapter VI.5, the contracted Bianchi identity plays the same 
role as the identity $\partial_\mu\partial_\nu F^{\mu\nu} = 0$ mentioned above: it implies energy momentum conservation
$D_\mu T^{\mu\nu} = 0$.

Sometimes the initial data are specified on only a patch on the Cauchy surface. Then 
the Cauchy problem is valid only in a bounded spacetime domain, namely the region that
is in causal contact with the initial data. The boundary of that domain is called the Cauchy
horizon. 


Appendix: Einstein’s confusion 


The astute reader might notice that if we were to regard $\vec E$ and $\vec B$ as the dynamical variables rather than $A_\mu$, there
would have been no hand-wringing over the initial value problem in electrodynamics. Given the initial values of
$\vec E$ and $\vec B$, and the initial positions and velocities of the charged particles, the two Maxwell’s equations

$\frac{\partial \vec E}{\partial t} = \vec\nabla \times \vec B - \vec J$ (3)

$\frac{\partial \vec B}{\partial t} = -\vec\nabla \times \vec E$ (4)

specify how $\vec E$ and $\vec B$ are to evolve and the Lorentz force law tells the charges how to move. The other Maxwell’s
equations $\vec\nabla \cdot \vec E = \rho$ and $\vec\nabla \cdot \vec B = 0$ constrain the initial data.

Historically, up until 1915, Einstein was confused, because he thought that the metric $g_{\mu\nu}$ was analogous to
$\vec E$ and $\vec B$, rather than to $A_\mu$. He presented an infamous “hole argument” as follows. Given some initial data,
allow the field equations (which of course he was still searching for at the time) to evolve the metric $g_{\mu\nu}$ to later
time. Imagine a spacetime region in the future, which Einstein referred to as a hole. Now perform, inside the 
hole, a general coordinate transformation that goes over smoothly to the identity transformation outside the hole. 
Reasoning from the analogous problem in electromagnetism with E and B specified at some initial data, Einstein 
was perplexed that the same initial data would lead to two different metrics inside the hole. As mentioned in 
the text, he concluded that either the field equation cannot be covariant under coordinate transformation, or the 
metric is not physical. Eventually it dawned on him that, in the context of the initial value problem, the metric is 
actually analogous to $A_\mu$ rather than $\vec E$ and $\vec B$.

Given our present understanding of physics, it is difficult to imagine that such a great mind could be
confused on “so elementary” a point. One possible explanation is that $A_\mu$ did not come to the forefront of physics
until the advent of quantum mechanics.⁶ Indeed, luminaries such as Oliver Heaviside⁷ had thundered that $A_\mu$
should be consigned to the dustbins of history and that physics needed only $\vec E$ and $\vec B$.

By the way, now you can understand Einstein’s rueful confession that I quoted in chapter VI.1: “I believed 
that I could show on general considerations that a law of gravitation invariant in relation to any transformation 
of coordinates whatever was inconsistent with the principle of causation . . . errors of thought which cost me 
two years of excessively hard work.”⁸


Exercises 


1 Verify by brute force computation that $E^{0\nu}$ does not contain $\partial_0^2$. Hint: To save labor, ask Professor Flat for help.


2 As in the electromagnetic case, once the constraint $E^{0\nu} = T^{0\nu}$ is imposed at the initial time, the time evolution
equations guarantee that it will continue to hold, which makes physical sense. Verify this. In other words,
show that $\partial_0(E^{0\nu} - T^{0\nu})$ vanishes.


Notes 


1. Currently, a major effort is under way to understand binary black hole merger by a combination of numerical 
and analytical methods. We will touch upon one aspect of this in chapter X.4. 

2. I have benefited from a discussion with T. Jacobson. 

3. See QFT Nut, p. 19.

4. If you want to mess with nonstandard theories of matter and of gravity, then you would be obliged to examine
the initial value problem anew. See T. Jacobson, arXiv:1108.1496.

5. Not to belabor a point, but in my experience, there might still be a student who is confused. Tell him or her
to compare and contrast the equation $(x - 1)(x + 1) = 2$ and the identity $(x - 1)(x + 1) = x^2 - 1$.

6. See QFT Nut, p. 245.

7. Oliver Heaviside, described by his best friend as a “first rate oddity,” was actually responsible for Maxwell’s
equations in the form we know them today. Self-educated, he never held an academic position. See B. J.
Hunt, Physics Today, November 2012, p. 48.

8. A. Einstein, Essays in Science, p. 83.


Recap to Part VI 


Here we finally get to the heart of the matter, or rather, of the field: the gravitational field 
is striving to extremize the scalar curvature. Remarkably, once we decide to regard the 
metric as the dynamical variable, the action governing the gravitational field is uniquely 
determined. 

Arguing by symmetry, and essentially without doing any work, we can write down 
Einstein’s field equation, and with that, start to unlock the expanding universe and the 
secrets of the black hole. 

Surprisingly, or perhaps not so surprisingly, the motion of particles of matter and of light 
around a massive body can be determined by analog problems in Newtonian mechanics. 
The second derivative, not the first nor the third, rules. 

As lifelong students of physics, we have talked about energy and momentum for a long 
time, but finally we know what they are. Energy and momentum are what the gravitational 
field listens to. 


Part VII | Black Holes 


VII.1 | Particles and Light around a Black Hole


Ambling around a black hole 


You have surely read popular accounts of black holes, without question one of the most 
fascinating features of Einstein gravity. Indeed, already in the introduction, I gave the 
heuristic Michell-Laplace 18th-century argument that a (spherical) object of mass M and 
radius R is a black hole if 


$R < 2GM$ (1)


You learned in chapter VI.3 that the empty spacetime around a spherically symmetric
mass distribution is described by

$ds^2 = -\left(1 - \frac{r_S}{r}\right) dt^2 + \left(1 - \frac{r_S}{r}\right)^{-1} dr^2 + r^2 d\Omega^2$ (for $r > R$) (2)

with the Schwarzschild radius $r_S = 2GM$. As we noted, the radius $R$ of this spherical object
does not appear explicitly in $ds^2$ but only implicitly, as we have indicated here. Since we only
solved Einstein’s field equation $R_{\mu\nu} = 0$ in empty spacetime, the Schwarzschild solution
(2) holds only for $r > R$.

The Schwarzschild radius of an ordinary massive object, the sun for example, is much 
less than its characteristic size R and so would be located inside the object, where the 
Schwarzschild solution is not relevant, as was already mentioned in chapter VI.3. Thus, 
we don’t have to worry about the apparent singularity in $ds^2$ at $r = r_S$ for stars and planets
and almost everything else. 

By the same token, we now recognize that the prescient Michell-Laplace criterion (1) 
amounts to saying that if a massive object is so compact that its actual size R is smaller 
than its Schwarzschild radius, it is a black hole. In other words, an object small for its mass,
or equivalently, massive for its size, is a black hole. For a black hole, the surface defined
by $r = r_S$, known as the horizon, is situated outside the black hole in empty spacetime,
where the Schwarzschild solution certainly holds. As we will see in detail in this chapter
and the next chapter, if an intrepid explorer reaches the Schwarzschild radius of such an
object and crosses into the region where $r < r_S$, he or she can never get back out to spatial
infinity. In short, for our purposes here, a black hole is defined as a massive object with 
an accessible horizon. 

The mystery of the black hole also deepens with the realization that spacetime is perfectly
smooth there (as indicated by the behavior of $R_{\mu\nu\lambda\rho}R^{\mu\nu\lambda\rho}$, for example). In the next chapter,
we will confirm our suspicion that the singularity in the metric at the horizon is merely
due to a poor choice of coordinates.

In this chapter, we focus on the motion of massive and massless particles around a black
hole, leaving various other issues for the next chapter.


A common misconception about black holes 


A common misconception is that, around a black hole, an irresistible mysterious force 
sucks everything in.* But in fact, physicists do not know of any additional force besides 
the ones they usually enumerate. Gravity is gravity, and whether outside a regular star or 
a black hole, we have the very same Schwarzschild metric. Thus, even if some evil power, 
in an implausible sci fi movie, somehow manages to turn the sun suddenly into a black 
hole, our earth, though deprived of its main energy source, would still calmly cruise along 
the same orbit. What is true, as we will see in this and the following chapter, is that near a 
black hole, spacetime can be warped so much that once trapped, even light cannot escape. 

We will start by studying the motion of massive and massless particles, such as planets
and photons, around a black hole. Indeed, in chapter V.4, we already worked out
the equations of motion for both particles and light in a general spherically symmetric
static spacetime described by two unknown functions $A(r)$ and $B(r)$. After we found the
Schwarzschild metric, all we had to do was plug in $A(r) = 1 - \frac{r_S}{r}$ and $A(r)B(r) = 1$, which
was precisely what we did in chapter VI.3 to study the deflection of light and the perihelion
shift of Mercury. So we are all set and ready to go. Here, for convenience, I will list
again the relevant equations, which, I emphasize again, are precisely the same around a
black hole as around any other massive object. The only difference is whether or not the
equations are relevant all the way down to $r_S$.


An unfamiliar Newtonian potential 


We first study the case of a massive particle moving in the equatorial plane, with the two
conservation laws $\frac{dt}{d\tau} = \frac{\epsilon}{1 - r_S/r}$ and $\frac{d\varphi}{d\tau} = \frac{l}{r^2}$, where $\epsilon$ and $l$ denote the energy and angular
momentum per unit mass of the particle, respectively. The motion of the radial coordinate
is governed by

$\left(\frac{dr}{d\tau}\right)^2 - \frac{r_S}{r} + \frac{l^2}{r^2} - \frac{l^2 r_S}{r^3} = \epsilon^2 - 1$ (3)


* Mark Twain allegedly said that the trouble with people is not that they know so little, but that what they know 
is largely not true.

Figure 1 The potential $V(r)$ as a function of $r/r_S$ for $(l/r_S)^2 = 0, 1, 3, 5, 6$. Note that on the scale of this
plot, the minimum of $V(r)$ is hardly visible.


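As already mentioned back in chapter V.4, we merely have to solve a Newtonian (except
that coordinate time has been replaced by proper time) problem in the potential*

$V(r) = -\frac{r_S}{r} + \frac{l^2}{r^2} - \frac{r_S l^2}{r^3} = -\left(\frac{r_S}{r}\right) + \left(\frac{l}{r_S}\right)^2 \left\{\left(\frac{r_S}{r}\right)^2 - \left(\frac{r_S}{r}\right)^3\right\}$ (4)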


The second form shows that if we measure $r$ and $l$ in sensible units, the potential is
surprisingly simple, controlled by the single parameter $(l/r_S)^2$.

After all the Riemannian geometry, with space and time unified and curved this way and 
that, the bottom line boils down, remarkably enough, to a “pretend” Newtonian problem. 

Life is sweet then: as long as you have mastered Newtonian mechanics, you can blast (3) 
any which way you like, and many texts fill page after page with exhaustive and exhausting 
studies of the resulting equations. 

The first two terms in the potential V are our old friends, representing gravitational 
attraction and centrifugal repulsion, respectively. The third term, call it the Einstein term, 
represents a novel and unfamiliar effect. In the real Newtonian problem, for $l = 0$, we fall
into the singularity at $r = 0$, but for $l \neq 0$, the centrifugal term keeps us from falling in
(as we discover every morning that the earth is still going around the sun). In contrast,
in the pretend Newtonian problem, for small $r$, the Einstein term $-\frac{r_S l^2}{r^3}$ kicks in and
totally dominates the centrifugal term. Under the right circumstances, we could fall into
the singularity even with $l \neq 0$.

Let us plot $V(r)$ as in figure 1, for $(l/r_S)^2 = 0, 1, 3, 5, 6$. For $l = 0$, we have simply $V(r) = -\frac{r_S}{r}$,
but as we increase $l$, the potential soon develops a minimum and a maximum. Our revered
Newton taught us how to find them: set $V'(r) = 0$ to obtain

$r_S r^2 - 2l^2 r + 3l^2 r_S = 0$ (5)

with the solutions

$r_{\min} = \frac{l^2}{r_S}\left(1 + \sqrt{1 - \frac{3r_S^2}{l^2}}\right)$ and $r_{\max} = \frac{l^2}{r_S}\left(1 - \sqrt{1 - \frac{3r_S^2}{l^2}}\right)$ (6)


* Note that in this “pretend” Newtonian problem, I define the “kinetic energy” $\left(\frac{dr}{d\tau}\right)^2$ without the usual
factor of $\frac{1}{2}$ to remove various factors of 2 in (VI.3.14) and (VI.3.15), rendering later expressions somewhat cleaner.
† Again, we have Pythagoras to thank.

The critical value for these extrema to appear is $(l/r_S)^2 = 3$. The shape of $V(r)$ differs
according to whether $l < l_c$ or $l > l_c$. I have intentionally omitted some of the labels in
figure 1. So, which curve in the figure corresponds to $l_c = \sqrt{3}\, r_S$? Note also that $r_{\min}$
approaches $\simeq 2l^2/r_S$ for large $l$ and that the minimum is very shallow.
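
For readers who like to check such formulas numerically, here is a small sketch that evaluates the two roots in (6) and confirms that $V'(r)$ vanishes there. The function names `V`, `dV`, and `extrema` are hypothetical, and I measure lengths in units with $r_S = 1$ by default.

```python
import math

def V(r, l, r_s=1.0):
    """The potential (4): attraction, centrifugal repulsion, and the Einstein term."""
    return -r_s / r + l**2 / r**2 - r_s * l**2 / r**3

def dV(r, l, r_s=1.0):
    """Derivative of V with respect to r; setting it to zero gives (5)."""
    return r_s / r**2 - 2.0 * l**2 / r**3 + 3.0 * r_s * l**2 / r**4

def extrema(l, r_s=1.0):
    """Return (r_max, r_min) from (6), or None if l^2 < 3 r_s^2 (no extrema)."""
    disc = 1.0 - 3.0 * r_s**2 / l**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (l**2 / r_s) * (1.0 - root), (l**2 / r_s) * (1.0 + root)

r_max, r_min = extrema(2.0)           # (l / r_s)^2 = 4, just above the critical value 3
print(r_max, r_min)                   # location of the maximum and of the minimum
print(dV(r_max, 2.0), dV(r_min, 2.0)) # both should vanish up to roundoff
print(extrema(100.0)[1] / (2.0 * 100.0**2))  # close to 1: r_min approaches 2 l^2 / r_s
```

For $(l/r_S)^2 = 4$ the maximum sits at $r = 2r_S$ and the minimum at $r = 6r_S$, which you can confirm by plugging into (5) by hand.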


Radial plunge 


For $l < l_c$, the particle has no option but to fall in.

Consider the simplest case of an intrepid observer plunging into the black hole along
the radial direction, so that $d\varphi = 0$, which by $\frac{d\varphi}{d\tau} = \frac{l}{r^2}$ corresponds to $l = 0$. To simplify
further, let the observer start with vanishing energy at $r = \infty$ (so that $\frac{dt}{d\tau}\big|_{r=\infty} = 1$, which
by $\frac{dt}{d\tau} = \frac{\epsilon}{1 - r_S/r}$ implies $\epsilon = 1$). Then (3) implies $\left(\frac{dr}{d\tau}\right)^2 = \frac{r_S}{r}$, which literally anybody could
integrate to give $r^{3/2} = r_0^{3/2} - \frac{3}{2} r_S^{1/2}\,\tau$, with $r_0$ the observer’s position at $\tau = 0$.
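
To see the finite proper time in numbers, here is a sketch that evaluates the elapsed proper time for the radial plunge with $l = 0$ and $\epsilon = 1$, once from the closed form and once by brute-force integration of $d\tau = \sqrt{r/r_S}\,dr$. The function names and the sample starting radius $r_0 = 10\,r_S$ are hypothetical choices for illustration.

```python
import math

def proper_time_closed_form(r, r0, r_s=1.0):
    """Proper time to fall from r0 down to r, from integrating (dr/dtau)^2 = r_s / r."""
    return (2.0 / 3.0) * (r0**1.5 - r**1.5) / math.sqrt(r_s)

def proper_time_numeric(r, r0, r_s=1.0, n=100_000):
    """Same quantity from a midpoint-rule integration of sqrt(r / r_s) dr."""
    h = (r0 - r) / n
    return h * sum(math.sqrt((r + (i + 0.5) * h) / r_s) for i in range(n))

r0 = 10.0                                        # start at ten Schwarzschild radii
tau_horizon = proper_time_closed_form(1.0, r0)   # arrival at r = r_s
tau_center = proper_time_closed_form(0.0, r0)    # arrival at r = 0
tau_check = proper_time_numeric(1.0, r0)
print(tau_horizon, tau_check, tau_center)
```

Both times come out finite (in units of $r_S$), and the short extra interval between horizon and center shows how little proper time remains once the horizon is crossed.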

The point is not our ability to solve a differential equation but that the observer reaches
the horizon $r_S$ at some finite proper time starting at some $r_0 > r_S$ (a fact we can see directly
from the differential equation without integrating). Not only does the observer suffer no
harm as he crosses the horizon, he also gets there soon enough according to his clock.
After he passes the horizon, he eventually reaches the origin $r = 0$, also in finite proper
time, at which point he is crushed by infinite tidal forces as measured by (recall chapter
VI.3) $R^{\mu\nu\lambda\rho}R_{\mu\nu\lambda\rho} = \frac{12 r_S^2}{r^6}$.

Nevertheless, even though nothing appears to be singular at the horizon, the equation
$\frac{dt}{d\tau} = \frac{\epsilon}{1 - r_S/r}$ indicates that something strange does occur there: as $r \to r_S$, the coordinate
time $t \to \infty$. Our observer crosses the horizon at infinite coordinate time. To his friend
stationed at $r = \infty$ (“stationed” means that his friend has to fire small rockets occasionally
to avoid slowly falling toward the black hole) the observer appears to approach but never
quite cross the horizon. (The time experienced by the friend, namely her proper time,
coincides with coordinate time, since for her, $\frac{dt}{d\tau} = 1$.)

A small puzzle here for you: what happens to $t$ as the observer crosses the horizon?

Recall our analysis of the gravitational redshift in chapter V.4, showing that the proper
time interval between the two signals as seen by the receiver is related to the proper time
interval as seen by the emitter $\Delta\tau_R = \Delta\tau_E\,(g_{00}(r_R)/g_{00}(r_E))^{1/2}$. So indeed, for the observer
stationed at $r = \infty$, the interval between signals sent by the infalling emitter gets infinitely
time dilated by the factor $1/\sqrt{1 - \frac{r_S}{r_E}}$ as he approaches the horizon $r_E \to r_S$.


Orbits with substantial angular momentum 


By substantial angular momentum, I mean $l > l_c = \sqrt{3}\, r_S$, so that $V(r)$ has a maximum
and a minimum. As shown in figure 2, we now have three possible cases, depending on
the effective energy $(\epsilon^2 - 1)$ as per (3):


1. For¹ $(\epsilon^2 - 1) > V(r_{\max})$, the particle sails over the top of the potential and thus spirals into
the black hole, as shown in figure 2a.
2. For $V(r_{\max}) > (\epsilon^2 - 1) > 0$, the particle bumps into the potential barrier and retreats back
to infinity. The shape of the orbit is shown in figure 2b.
3. For² $0 > (\epsilon^2 - 1) > V(r_{\min})$, the particle is trapped in the potential, and follows an “elliptical”
orbit with a shifting perihelion (figure 2c).

Figure 2 For $l > l_c = \sqrt{3}\, r_S$, there are three possible types
of orbit (shown at right), depending on the value of the
effective energy $\epsilon^2 - 1$. The potential $V(r)$ is plotted
schematically at left (compare figure 1) to emphasize
its two extrema. (a) Effective energy $\epsilon^2 - 1 > V(r_{\max})$;
(b) $V(r_{\max}) > \epsilon^2 - 1 > 0$; (c) $0 > \epsilon^2 - 1 > V(r_{\min})$.


If you want, you can work out the shape of these orbits by solving for $r(\tau)$ and $\varphi(r)$: just a
matter of showing off your ability to solve differential equations. You could always integrate 
them numerically on a computer. 
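
The three cases listed above can be turned into a few lines of code. This is only a sketch: the function name `classify_orbit` and its return labels are hypothetical, lengths are measured in units with $r_S = 1$ by default, and I assume the particle starts outside the potential barrier, at $r > r_{\max}$.

```python
import math

def V(r, l, r_s=1.0):
    """The potential (4)."""
    return -r_s / r + l**2 / r**2 - r_s * l**2 / r**3

def classify_orbit(eps, l, r_s=1.0):
    """Classify the motion of a particle starting at r > r_max, for l > sqrt(3) r_s."""
    disc = 1.0 - 3.0 * r_s**2 / l**2
    if disc <= 0.0:
        raise ValueError("need l > sqrt(3) r_s for V(r) to have two extrema")
    root = math.sqrt(disc)
    r_max = (l**2 / r_s) * (1.0 - root)   # location of the maximum, from (6)
    r_min = (l**2 / r_s) * (1.0 + root)   # location of the minimum, from (6)
    energy = eps**2 - 1.0                 # the effective energy appearing in (3)
    if energy > V(r_max, l, r_s):
        return "plunge"      # case 1: sails over the top of the barrier
    if energy > 0.0:
        return "scatter"     # case 2: bounces off the barrier, back to infinity
    if energy > V(r_min, l, r_s):
        return "bound"       # case 3: trapped near the minimum
    return "forbidden"       # below the minimum: no motion possible outside r_max

labels = [classify_orbit(eps, 4.0) for eps in (2.0, 1.2, 0.995, 0.9)]
print(labels)
```

Note that when $V(r_{\max}) < 0$, which happens for $3 < (l/r_S)^2 < 4$, the scattering case is empty and the function never returns it.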


Circular orbits and Kepler 


Looking at $V(r)$, we see that there are two circular orbits with radius given by (6). The orbit
at $r_{\max} = \frac{l^2}{r_S}\left(1 - \sqrt{1 - \frac{3r_S^2}{l^2}}\right)$, perched at the maximum of $V(r)$, is obviously unstable. Any
perturbation will cause the orbiting particle either to fall into the black hole or to move
toward the stable orbit at $r_{\min} = \frac{l^2}{r_S}\left(1 + \sqrt{1 - \frac{3r_S^2}{l^2}}\right)$. In contrast, the stable circular orbit
lives comfortably at the minimum of $V(r)$.

Interestingly, Kepler’s third law continues to hold, as we now show. The 4-velocity $V^\mu =
(V^t, 0, 0, V^\varphi)$ of the planet, or whatever, is given by the conservation laws (V.4.13-14):
$V^t = \frac{dt}{d\tau} = \epsilon/(1 - \frac{r_S}{r})$ and $V^\varphi = \frac{d\varphi}{d\tau} = l/r^2$, with the two conserved quantities $l$ determined
by (5)

$l^2 = \frac{r_S r}{2}\left(1 - \frac{3r_S}{2r}\right)^{-1}$

and $\epsilon$ determined by (3)

$\epsilon^2 = V(r) + 1 = \left(1 - \frac{r_S}{r}\right)^2 \left(1 - \frac{3r_S}{2r}\right)^{-1}$

(Here various quantities are to be evaluated at the radius of the orbit.)
Define the angular velocity $\Omega \equiv \frac{d\varphi}{dt} = \frac{d\varphi}{d\tau} \Big/ \frac{dt}{d\tau} = V^\varphi/V^t$. Note that $\Omega$ is defined in terms
of coordinate time, not proper time. Plugging in what we had, we obtain*

$\Omega^2 = \frac{r_S}{2r^3} = \frac{GM}{r^3}$ (7)

Kepler’s third law survives Einstein.
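
If you would rather let the computer do the plugging in, here is a sketch that builds $l$ and $\epsilon$ for a circular orbit from (5) and (3) and then forms $\Omega = V^\varphi/V^t$. The function name `circular_orbit` is hypothetical, and lengths are in units with $r_S = 1$ by default.

```python
import math

def circular_orbit(r, r_s=1.0):
    """Return (l, eps, Omega) for a circular orbit of radius r > 3 r_s / 2."""
    factor = 1.0 - 1.5 * r_s / r
    if factor <= 0.0:
        raise ValueError("no circular orbit of a massive particle at r <= 3 r_s / 2")
    l = math.sqrt(0.5 * r_s * r / factor)          # from V'(r) = 0, equation (5)
    eps = (1.0 - r_s / r) / math.sqrt(factor)      # from (3) with dr/dtau = 0
    v_phi = l / r**2                               # d(phi)/d(tau)
    v_t = eps / (1.0 - r_s / r)                    # dt/d(tau)
    return l, eps, v_phi / v_t

for r in (3.0, 10.0, 100.0):
    l, eps, omega = circular_orbit(r)
    print(r, omega**2, 1.0 / (2.0 * r**3))  # the last two columns should agree
```

At $r = 3r_S$ the function returns $l = \sqrt{3}\,r_S$, the critical value of the angular momentum discussed earlier.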


Accretion disks: Mightier than nuclear fusion 


As $l$ decreases toward $l_c = \sqrt{3}\, r_S$, the stable orbit with radius $r_{\min} = \frac{l^2}{r_S}\left(1 + \sqrt{1 - \frac{3r_S^2}{l^2}}\right)$
keeps shrinking until $r_{\min}$ reaches its minimum value of

$r_{\rm ISCO} = r_{\min}\big|_{l = l_c} = 3r_S$ (8)

The orbit with radius $r_{\rm ISCO}$ is known as the innermost stable circular orbit in relativistic
astrophysics. Note that it sits well outside the Schwarzschild radius.

These simple remarks underlie the essential physics of accretion disks around black 
holes. Around a black hole, infalling debris, consisting of matter from a companion star, 
for example, forms a disk. Through dissipative processes, such as collisions between 
particles and electromagnetic radiation, particles in the disk gradually lose energy and 
angular momentum and move inward until they reach $r_{\rm ISCO}$. From there, further loss of
angular momentum will cause them to fall in. By angular momentum conservation, this 
process will cause the black hole to rotate, so that eventually the Schwarzschild solution is 
no longer adequate. In chapter VII.5 we will discuss rotating black holes. 

Black holes power some of the most spectacular processes known to astrophysics. What 
fraction of the rest energy $mc^2$ does a particle lose as it crashes inward to the innermost
stable circular orbit? As always, see if you can figure it out before reading on. 

What is the energy of a particle in the innermost stable circular orbit? Using (4), we
evaluate $V(r = r_{\rm ISCO}, l = l_c) = -\frac{1}{3} + \frac{3}{9} - \frac{3}{27} = -\frac{1}{9}$. Setting $\frac{dr}{d\tau} = 0$ in (3), we find $\epsilon =
\sqrt{1 - \frac{1}{9}} = \sqrt{\frac{8}{9}}$. But from $\frac{dt}{d\tau} = \frac{\epsilon}{1 - r_S/r}$, we see that the particle started out at $r = \infty$ with
$\epsilon = 1$. Thus, the fraction of energy ultimately lost to electromagnetic radiation amounts to
$1 - \frac{2\sqrt{2}}{3} \simeq 0.06$.

* A quick reminder of Kepler’s third law in Newtonian physics: $v^2/r = GM/r^2$, $\Omega^2 = ((2\pi)/(2\pi r/v))^2 =
GM/r^3$.

Is this a lot? What do we compare the 6% to? Consider nuclear fusion, which powers 
the stars and the hydrogen bomb. Four protons are converted to a helium nucleus accom- 
panied by the emission of two positrons and two neutrinos. The fraction of energy release 
is calculated easily by comparing the mass of the helium nucleus to 4 times the mass of 
the proton, according to what we learned back in part III. The fabled equation $E = mc^2$,
yes! It turns out that in nuclear fusion, the energy released amounts to 0.7%. The black 
hole is almost ten times more efficient. Later, we will see that a rotating black hole is even 
more efficient! 
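
The arithmetic behind this comparison fits in a few lines. In this sketch the variable names are hypothetical, lengths are in units with $r_S = 1$, and the fusion figure of 0.7% is simply taken from the text rather than computed from nuclear masses.

```python
import math

def V(r, l, r_s=1.0):
    """The potential (4)."""
    return -r_s / r + l**2 / r**2 - r_s * l**2 / r**3

r_s = 1.0
l_c = math.sqrt(3.0) * r_s       # critical angular momentum
r_isco = 3.0 * r_s               # innermost stable circular orbit, equation (8)

# Circular orbit: dr/dtau = 0 in (3), so eps^2 - 1 = V(r_isco).
eps_isco = math.sqrt(1.0 + V(r_isco, l_c, r_s))
fraction_lost = 1.0 - eps_isco   # the particle started at infinity with eps = 1

fusion_fraction = 0.007          # figure quoted in the text for nuclear fusion
print(fraction_lost)             # about 0.057
print(fraction_lost / fusion_fraction)  # about 8
```

The ratio comes out a little above 8, which is the sense in which the black hole is almost ten times more efficient than fusion.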


Massless particle 


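For a massless particle, we have to replace the proper time $\tau$ by the affine parameter $\zeta$.
Then, for a photon moving in the equatorial plane, we have (in parallel with the case of a
massive particle) $\frac{dt}{d\zeta} = \frac{\epsilon}{1 - r_S/r}$ and $\frac{d\varphi}{d\zeta} = \frac{l}{r^2}$, with $\epsilon$ and $l$ two integration constants. The radial
coordinate of the worldline $r(\zeta)$ is determined by

$\frac{1}{l^2}\left(\frac{dr}{d\zeta}\right)^2 + \frac{1}{r^2}\left(1 - \frac{r_S}{r}\right) = \frac{\epsilon^2}{l^2} \equiv \frac{1}{b^2}$ (9)

From the displayed equations, we can see explicitly our freedom to scale the affine parameter
by $\zeta \to \zeta/l$. As explained earlier, physics does not depend on $\epsilon$ and $l$ separately, but
only on the impact parameter squared: $b^2 = l^2/\epsilon^2$.

Alternatively, eliminate the affine parameter $\zeta$ by dividing $\left(\frac{dr}{d\zeta}\right)^2$ by $\left(\frac{d\varphi}{d\zeta}\right)^2 = \frac{l^2}{r^4}$, so that

$\left(\frac{dr}{d\varphi}\right)^2 = \frac{r^4}{b^2} - r(r - r_S)$ (10)

(which we used in discussing light deflection) or by dividing by $\left(\frac{dt}{d\zeta}\right)^2 = \frac{\epsilon^2}{(1 - r_S/r)^2}$, so that

$\left(\frac{dr}{dt}\right)^2 = \left(1 - \frac{r_S}{r}\right)^2\left(1 - \frac{b^2}{r^3}(r - r_S)\right)$ (11)

(which we used in discussing radar echo delay, in chapter VI.3).

The qualitative features of light moving around a black hole can be easily read off
from (9), which with a particular choice of the affine parameter becomes $\left(\frac{dr}{d\zeta}\right)^2 + U(r) = \frac{1}{b^2}$.
Plot the effective potential

$U(r) = \frac{1}{r^2} - \frac{r_S}{r^3}$ (12)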


For large $r$, $U(r) \simeq \frac{1}{r^2}$ and goes up as $r$ decreases, reaching a maximum value of $U_{\rm max} =
4/(27r_S^2)$ at $r = 3r_S/2$, then plunging downward to $-\infty$. Thus, the motion of light can be
divided into three cases (figure 3):

Figure 3 For light moving around a black hole, there
are three possible types of orbit (shown at right),
depending on the impact parameter. The potential U(r)
is plotted schematically (left). (a) Large impact parameter
$1/b^2 < U_{\rm max}$; (b) small impact parameter $1/b^2 > U_{\rm max}$;
(c) impact parameter just right $1/b^2 = U_{\rm max}$.


1. Large impact parameter $1/b^2 < U_{\rm max}$: light comes in and then goes back out, corresponding
to the deflection of light we studied in chapter VI.3.

2. Small impact parameter $1/b^2 > U_{\rm max}$: light comes in and plunges into the black hole.

3. The impact parameter is just right ($1/b^2 = U_{\rm max}$) to trap light going around in an (unstable)
circular orbit.


The precise shape of the orbit can be obtained by integrating (10). 
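
The three cases can be sorted mechanically by comparing $1/b^2$ with the maximum of the effective potential. A minimal sketch, assuming $U(r) = 1/r^2 - r_S/r^3$ and working in units of $r_S$; the function names and the tolerance are illustrative choices, not part of the text.

```python
import math

def u_eff(r, r_s=1.0):
    """Effective potential for light, U(r) = 1/r^2 - r_s/r^3."""
    return 1.0 / r**2 - r_s / r**3

def classify_light_orbit(b, r_s=1.0, tol=1e-12):
    """Return which of the three cases a given impact parameter b falls into."""
    u_max = 4.0 / (27.0 * r_s**2)      # value of U at its maximum, r = 3 r_s / 2
    inv_b2 = 1.0 / b**2
    if abs(inv_b2 - u_max) <= tol * u_max:
        return "circular (unstable)"
    return "deflected" if inv_b2 < u_max else "captured"

# Critical impact parameter, from 1/b^2 = 4/(27 r_s^2), in units of r_s.
b_critical = 1.5 * math.sqrt(3.0)

for b in (3.0, b_critical, 2.0):
    print(f"b = {b:.4f} r_S -> {classify_light_orbit(b)}")
```

Light aimed with $b$ larger than about $2.6\,r_S$ is deflected; light aimed more closely is captured.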


A common confusion about plunging into a black hole 


Confusio speaks up: “I have learned that the fundamental laws in classical physics (and also
quantum physics) are time reversal invariant,³ that is, they are unchanged upon $t \to -t$.
I read that if we take a movie depicting a microscopic* process and run it backward, the 
reversed process must also be allowed by the laws of physics. So why can’t I run the film 
of the observer radially plunging into a black hole and watch him come flying out?” 


* Thus excluding processes involving a macroscopic number of particles, with the attendant discussion about
entropy, second law, and so on and so forth. Indeed, you can’t make an egg out of an omelette.

Well, well, that Confusio is more astute than we think. Indeed, the Lagrangian

$L = \left[\left(1 - \frac{r_S}{r}\right)\left(\frac{dt}{d\tau}\right)^2 - \left(1 - \frac{r_S}{r}\right)^{-1}\left(\frac{dr}{d\tau}\right)^2 - r^2\left(\frac{d\theta}{d\tau}\right)^2 - r^2\sin^2\theta\left(\frac{d\varphi}{d\tau}\right)^2\right]^{\frac{1}{2}}$

governing the motion of a particle in Schwarzschild spacetime is manifestly invariant
under $t \to -t$. So where is the catch in the standard arguments⁴ about time reversal
invariance?

The catch, as I have already mentioned, is that the coordinate time t increases to $+\infty$
as $r \to r_S$ and then decreases from $+\infty$ after the observer crosses the horizon. Indeed, as
is evident from the Lagrangian just displayed, t and r exchange roles for $r < r_S$. The letter
“t” no longer denotes time! Much more on this in the next chapter.

The standard arguments about time reversal invariance work perfectly well as long as
$r > r_S$. Thus, if we could somehow install a trampoline at $r_S^+$ just outside the black hole,
the observer in radial plunge could bounce back⁵ out to $r = \infty$, retracing his trajectory.


Appendix: Painlevé-Gullstrand coordinates 


An interesting set of coordinates was introduced by Paul Painlevé* and Allvar Gullstrand† in 1921. In the
Schwarzschild metric $ds^2 = -(1 - v^2)dt^2 + \frac{dr^2}{1 - v^2} + r^2d\Omega^2$ with $v^2 = \frac{r_S}{r}$, let $dt = dT - h(r)dr$. We could take
h(r) to be any reasonable function of r, but a particularly nice choice is to make the coefficient of $dr^2$ equal to 1,
which fixes $h = \frac{v}{1 - v^2}$. We obtain

$ds^2 = -dT^2 + \left(dr + \sqrt{\frac{r_S}{r}}\,dT\right)^2 + r^2d\Omega^2$ (13)

This shows conclusively that spacetime is not singular at $r = r_S$, a point that was not widely appreciated until the
early 1960s. (More on this in the next chapter.) Recall from the text that an observer in radial plunge starting at
rest at infinity follows $\frac{dr}{d\tau} = -\sqrt{\frac{r_S}{r}}$. (See exercises 3 and 4.) Note also that in Painlevé-Gullstrand coordinates, a
slice of spacetime at fixed T corresponds to flat space.
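
The substitution $dt = dT - h\,dr$ can be checked numerically. The sketch below assumes $h = v/(1 - v^2)$ with $v = \sqrt{r_S/r}$ and compares the coefficients of $dT^2$, $dT\,dr$, and $dr^2$ obtained from the Schwarzschild form with those read off from $-dT^2 + (dr + v\,dT)^2$. The function names are illustrative, and the point $r = r_S$, where $h$ diverges, is avoided.

```python
import math

def coefficients_after_substitution(r, r_s=1.0):
    """Insert dt = dT - h dr into -(1 - v^2) dt^2 + dr^2 / (1 - v^2)
    and return the coefficients of (dT^2, dT dr, dr^2)."""
    v = math.sqrt(r_s / r)
    f = 1.0 - v**2
    h = v / f
    coeff_TT = -f
    coeff_Tr = 2.0 * f * h              # cross term from expanding (dT - h dr)^2
    coeff_rr = 1.0 / f - f * h**2
    return coeff_TT, coeff_Tr, coeff_rr

def coefficients_expected(r, r_s=1.0):
    """Coefficients of (dT^2, dT dr, dr^2) in -dT^2 + (dr + v dT)^2."""
    v = math.sqrt(r_s / r)
    return -1.0 + v**2, 2.0 * v, 1.0

for r in (0.5, 2.0, 10.0):              # both inside and outside r = r_s
    print(r, coefficients_after_substitution(r), coefficients_expected(r))
```

The agreement holds on both sides of the horizon, and the coefficient of $dr^2$ is 1 at every sampled $r$, consistent with flat spatial slices at fixed T.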


Exercises 


1 Show that in the derivation of Kepler’s third law, the results we obtained for $V^t$ and $V^\varphi$ satisfy the consistency
check $g_{\mu\nu}V^\mu V^\nu = -1$.

2 Determine the shape of the orbit of light in the presence of a black hole.

3 Verify that in Painlevé-Gullstrand coordinates, an observer in radial plunge starting at rest at infinity follows
$\frac{dr}{d\tau} = -\sqrt{\frac{r_S}{r}}$, as must be the case, since in transforming from the Schwarzschild coordinates, we did not
change r. Show also that $\frac{dr}{dT} = -\sqrt{\frac{r_S}{r}}$.


4 The preceding result shows that the velocity $\frac{dr}{dT}$ of the radially plunging observer reaches 1 at the horizon
and then goes to $\infty$ at the physical singularity at r = 0. You are of course sophisticated enough by now not
to dash off some crackpot claim that Einstein was wrong about the speed of light setting the ultimate speed
limit. Verify that $\frac{dr}{dT}$ is still less than the speed of light.

* Twice the prime minister of France.
† Nobel Laureate in Physiology or Medicine 1911, he opposed giving Einstein the Nobel Prize for his theory
of special relativity.


Notes 


1. For the record, $V(r_{\rm max}) = \frac{h\left(h - \sqrt{(h-3)h} - 2\right)}{\left(\sqrt{(h-3)h} - h\right)^3}$, where $h \equiv (l/r_S)^2$.

2. Again, for the record, $V(r_{\rm min}) = -\frac{h\left(h + \sqrt{(h-3)h} - 2\right)}{\left(h + \sqrt{(h-3)h}\right)^3}$.

3. The only known exceptions occur in the decay of certain elementary particles.

4. I recommend J. J. Sakurai, Invariance Principles and Elementary Particles, chapter 4.

5. As I mentioned in chapter VI.3, Einstein did think erroneously that a particle falling into a black hole would
bounce back at the Schwarzschild radius.


Poor choice of coordinates


In the Schwarzschild solution, which I reproduce here for convenience,

$ds^2 = -\left(1 - \frac{r_S}{r}\right)dt^2 + \left(1 - \frac{r_S}{r}\right)^{-1}dr^2 + r^2d\Omega^2$ (1)

the components of the metric $g_{tt} = -\left(1 - \frac{r_S}{r}\right)$ and $g_{rr} = \left(1 - \frac{r_S}{r}\right)^{-1}$ change sign at $r =
r_S = 2GM$, a place known as the horizon, as was mentioned in the preceding chapter.
(The reason for the term “horizon” will become clear in this chapter.) What we call time t
and what we regard as the radial coordinate r exchange roles* inside the horizon, leading
us to expect that something extraordinary happens at the horizon. However, we learned in
chapter VI.3 (by computing $R_{\mu\nu\lambda\rho}R^{\mu\nu\lambda\rho}$, for example) that spacetime is perfectly smooth
there. We suspected that the singularity at the horizon was merely due to a nasty coordinate
choice (recall appendix 1 of chapter I.6). In this chapter, we will confirm this suspicion by
exhibiting a better behaved coordinate system near and inside the horizon.

Let’s look at radial light rays, for which $d\theta = 0$ and $d\varphi = 0$, so that we can effectively
suppress the angular coordinates. Their paths are determined, as usual, by $ds^2 = 0 =
-\left(1 - \frac{r_S}{r}\right)dt^2 + \left(1 - \frac{r_S}{r}\right)^{-1}dr^2$, that is, by $dt = \pm\frac{r}{r - r_S}dr$; the plus sign
describes outgoing light rays ($dr > 0$ for $dt > 0$), and the minus sign incoming light rays
($dr < 0$ for $dt > 0$). Infinitely far away from the black hole, $dt = \pm dr$ and light rays move at
45°. But as we move in close to the black hole, the angle the light rays make with the r-axis
increases with decreasing r until it reaches 90° at the horizon, as illustrated in figure 1.
The light cone closes up like a clam.
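
To put numbers on the closing clam, here is a short sketch that evaluates the angle between a radial light ray and the r-axis in the (t, r) plane, using $|dt/dr| = r/(r - r_S)$ for $r > r_S$. The function name and the sample radii are illustrative.

```python
import math

def cone_angle_deg(r, r_s=1.0):
    """Angle in degrees between a radial light ray and the r-axis in the
    Schwarzschild (t, r) plane, valid for r > r_s."""
    return math.degrees(math.atan(r / (r - r_s)))

for r in (1000.0, 10.0, 2.0, 1.1, 1.001):
    print(f"r/r_S = {r:8.3f}   angle = {cone_angle_deg(r):6.2f} degrees")
```

The angle starts at 45° far away and creeps toward 90° as r approaches the horizon.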


* Because of this role exchange, it has been suggested that the region $r < r_S$ be called not “inside the horizon”
as it is usually called, but “after the horizon.” By the same token, the Schwarzschild singularity at r = 0 is a
moment in time, not a place you visit.

Figure 1 The light cone closes up like a clam as r decreases toward the horizon.


Tilting light cones spill particles into the black hole 


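Let’s look for a better set of coordinates than (t, r) by massaging the Schwarzschild solution
as follows:

$ds^2 = -\left(1 - \frac{r_S}{r}\right)dt^2 + \left(1 - \frac{r_S}{r}\right)^{-1}dr^2 + r^2d\Omega^2$

$\quad = -\left(\frac{r - r_S}{r}\right)\left(dt + \frac{r}{r - r_S}dr\right)\left(dt - \frac{r}{r - r_S}dr\right) + r^2d\Omega^2$

$\quad = -\left(\frac{r - r_S}{r}\right)\left(d\bar{t} + dr\right)\left(d\bar{t} - \frac{r + r_S}{r - r_S}dr\right) + r^2d\Omega^2$ (2)

where we have defined $d\bar{t} = dt + \frac{r_S}{r - r_S}dr$. (We could easily integrate this to determine $\bar{t}(t, r)$
up to an irrelevant additive constant, but it is not needed for our purposes here.) Note that
the angular part of the metric is just going along for the ride.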
We now show that the coordinates $(\bar{t}, r)$ are more suitable than (t, r) for describing
outgoing and incoming radial light rays. In the $(\bar{t}, r)$ plane, the radial light rays follow
$d\bar{t} + dr = 0$ (incoming, since $dr < 0$ for $d\bar{t} > 0$) or $d\bar{t} = \frac{r + r_S}{r - r_S}dr$ (outgoing for $r > r_S$, since
then $dr > 0$ for $d\bar{t} > 0$). Thus, the incoming light rays always move at 45°. In contrast,
the angle the outgoing light ray makes with the r-axis varies with r, starting out at 45° for
$r \gg r_S$ far from the black hole, slowly increasing with decreasing r until it reaches 90° at
the horizon, as shown in figure 2.


The light cone gradually tilts over. Once $r < r_S$, the outgoing light ray no longer deserves
the name “outgoing”: the relation $d\bar{t} = \frac{r + r_S}{r - r_S}dr = -\frac{r + r_S}{r_S - r}dr$ now implies that $dr$ and $d\bar{t}$ have
opposite signs, so that as $\bar{t}$ increases, r decreases. See figure 2.
The extraordinary feature is that for $r < r_S$, we no longer have any outgoing light rays!
All light rays are ingoing. A fortiori, material particles cannot escape, since their worldlines
have to lie inside the light cone. Inside the horizon, the tilting light cone appears to
“spill” the material particles contained inside toward r = 0.

Figure 2 In the $(\bar{t}, r)$ plane, the light cone gradually tilts over. Inside the horizon, the tilting light
cone appears to “spill” the material particles contained inside toward r = 0.
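
The tilting can be made quantitative with a small sketch. It assumes the slope $d\bar{t}/dr = (r + r_S)/(r - r_S)$ for the nominally outgoing ray and measures its angle from the r-axis over the range 0° to 180°, so that values above 90° mean r decreases as $\bar{t}$ increases. The function names are illustrative.

```python
import math

def outgoing_slope(r, r_s=1.0):
    """Slope d(t-bar)/dr of the 'outgoing' radial light ray (r != r_s)."""
    return (r + r_s) / (r - r_s)

def outgoing_angle_deg(r, r_s=1.0):
    """Angle with the r-axis, from 0 to 180 degrees, for increasing t-bar.
    atan2 takes the t-bar displacement first and the r displacement second."""
    return math.degrees(math.atan2(r + r_s, r - r_s))

for r in (1000.0, 2.0, 1.0, 0.5, 0.01):
    print(f"r/r_S = {r:8.3f}   angle = {outgoing_angle_deg(r):7.2f} degrees")
```

The angle passes through 90° exactly at the horizon and keeps growing inside, approaching 135° near r = 0. The incoming ray, by contrast, stays at a fixed 45° tilt throughout.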


As already remarked, invariant measures of spacetime curvature behave smoothly at the 
horizon, and spacetime appears to be perfectly normal. What changes at the horizon is the 
causal structure of spacetime, as we will see in more detail shortly. The $(\bar{t}, r)$ coordinates
make clear that inside the horizon, light rays and particles can perfectly well reach r = 0;
the closing up of the light cone in the (t, r) plane as $r \to r_S$ merely shows the inadequacy
of t as a coordinate.


Eternal versus actual black holes 


Our description here is as if somebody manufactured a black hole a really long time ago, 
somehow, and placed it at r = 0 at the beginning of time $t = -\infty$. But the universe started
in a Big Bang. Thus, as a description of a black hole, the Schwarzschild solution in its 
entirety, including the region inside the horizon, represents a mathematical curiosity, 
not an actual physical situation. In contrast to this so-called eternal black hole, an actual 
physical black hole represents a possible final state of stellar evolution. (If you don’t know 
this, I will touch upon this fascinating story¹ briefly in chapter VII.4.)

A realistic description of black hole formation can be enormously complicated, but in
theoretical physics, the process is often idealized as a spherically symmetric cloud of dust
collapsing. The technical term “dust” refers to a collection of particles, each following a
geodesic, that do not interact with one another directly. The worldline of a particle on the
surface of the dust ball is indicated schematically in figure 3a, which shows the interior
of the dust ball as a shaded region. (Strictly speaking, inside the dust ball, the $\bar{t}$ coordinate
may be inappropriate and so should be used only after the formation of the black
hole.) As usual, this depiction in the $(\bar{t}, r)$ plane is (1 + 1)-dimensional, with $\theta$ and $\varphi$ suppressed.
In figure 3b, we show the same process in a (2 + 1)-dimensional depiction, with $\theta$
suppressed.

Figure 3 A dust ball collapsing into a black hole. (a) The worldline of a particle on the surface of the dust ball;
the interior of the dust ball as a shaded region as shown in the $(\bar{t}, r)$ plane. (b) The formation of a black hole,
with the angle $\theta$ suppressed.


An escape attempt that barely failed 


Imagine that after the formation of the black hole, a spherical shell of matter centered at 
the origin comes crashing into the black hole. In other words, the dust ball was actually 
enclosed by a spherical shell of dust. The collapse of the shell increases the mass of the 
black hole from $M$ to $M + \Delta M$, with a corresponding increase of the Schwarzschild radius
from $r_S$ to $r_S + \Delta r_S$. See figure 4. The original horizon and the new horizon are indicated
by the dashed and solid lines, respectively. 

Now consider a light ray emitted from inside the dust cloud, thinking to itself, “Phew, 
I’m going to escape from this black hole!” but then just barely getting trapped by the more 
massive black hole. This story of a barely failed escape attempt makes clear that the horizon 
should be thought of as a surface formed by light rays in the $(\bar{t}, r)$ plane moving “vertically,”
that is, at an angle of 90° with the r-axis, in other words, moving along the line $r = r_S$. It
is a null surface (as first defined in chapter III.3).

Figure 4 After the formation of a black hole, a spherical shell of
matter centered at the origin comes crashing in, thus forming a
more massive black hole. A light ray emitted from inside the dust
cloud that would have escaped from the less massive black hole is
now trapped by the more massive one.


Light rays moving at 45° 


In the $(\bar{t}, r)$ coordinates, ingoing radial light rays always move at 45° from the vertical,
suggesting to us that it might be nice to have outgoing radial light rays also move at 45° from 
the vertical. Instead of this light cone that tilts as we approach the horizon, we would have a 
fixed light cone, just as in Minkowskian spacetime. Kruskal* and Szekeres independently 
found? the desired coordinates. In my experience, students are often confused, and so I 
will go at a perhaps excruciatingly slow pace. 

Once again, look at the second line in (2): $ds^2 = -\left(\frac{r - r_S}{r}\right)\left(dt + \frac{r}{r - r_S}dr\right)\left(dt - \frac{r}{r - r_S}dr\right) +
r^2d\Omega^2$, practically begging us to define

$dp = dt + \frac{r}{r - r_S}dr$ and $dq = dt - \frac{r}{r - r_S}dr$ (3)

Then

$ds^2 = -\left(\frac{r - r_S}{r}\right)dp\,dq + r^2d\Omega^2$ (4)

with r to be regarded as a function of p and q. Indeed, $d(p + q) = 2dt$ and
$d(p - q) = \frac{2r}{r - r_S}dr = 2\left(1 + \frac{r_S}{r - r_S}\right)dr$, so that $p + q = 2t$ and $p - q = 2r + 2r_S\log\frac{|r - r_S|}{r_S}$
with a convenient choice of integration constants. Note the need for the absolute value:
the integral of 1/x is log |x|, not log x!
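
A quick numerical check of the integration, assuming $p - q = 2r + 2r_S\log(|r - r_S|/r_S)$: its derivative with respect to r should equal $2r/(r - r_S)$ on both sides of the horizon. This is a sketch with illustrative function names, in units of $r_S$, using a central finite difference.

```python
import math

def p_minus_q(r, r_s=1.0):
    """p - q = 2 r + 2 r_s log(|r - r_s| / r_s), with the absolute value kept."""
    return 2.0 * r + 2.0 * r_s * math.log(abs(r - r_s) / r_s)

def central_derivative(f, x, h=1e-6):
    """Second-order central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for r in (0.3, 0.9, 1.5, 4.0):
    numeric = central_derivative(p_minus_q, r)
    exact = 2.0 * r / (r - 1.0)
    print(f"r/r_S = {r:4.1f}   numeric = {numeric:10.5f}   2r/(r - r_S) = {exact:10.5f}")
```

Without the absolute value, the logarithm would not even be defined at the two sample points inside the horizon.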


For $r \gg r_S$, we recover Minkowski spacetime, of course, with $dp, dq \to dt \pm dr$.
This suggests yet another change of coordinates: $P = e^{p/2r_S}$ and $Q = -e^{-q/2r_S}$, so that
(4) becomes

$ds^2 = -\frac{4r_S^3}{r}e^{-r/r_S}\,{\rm sign}(r - r_S)\,dP\,dQ + r^2d\Omega^2$ (5)

The appearance of the sign function stems from the appearance of the absolute value. The
only singularity is now at r = 0, which we know to be physical. (Note that while p and q
have the same dimension as r, the coordinates (P, Q) are dimensionless.)


Had we sloppily neglected the absolute value in integrating $\frac{dr}{r - r_S}$ and thus omitted the
sign function in (5), we would have been tempted to write $V = \frac{1}{2}(P + Q)$, $U = \frac{1}{2}(P - Q)$,
so that $dV^2 - dU^2 = dP\,dQ$. But being careful, we see that if we want to have the nice
form

$ds^2 = -\frac{4r_S^3}{r}e^{-r/r_S}\left(dV^2 - dU^2\right) + r^2d\Omega^2$ (6)

we have to require $dV^2 - dU^2 = {\rm sign}(r - r_S)\,dP\,dQ$ with the sign function. We will determine
V, U in a minute, but for now, let’s admire (6).


Radial light rays are determined by $dU = \pm dV$. We have accomplished our goal of having
light rays move always at 45° from the vertical, so that material particles always move at 
less than 45° from the vertical. Note that V is always the timelike coordinate; none of this 
funny business that t sometimes denotes a timelike coordinate and sometimes a spacelike 
coordinate. 

Most of all, we see that, ta dah, the metric is not singular at all* at the horizon $r = r_S$,
but still singular at the origin r = 0, as it should be. 

But as some of you know, and may even know very well, there is no free lunch. The 
coordinate singularity at the horizon cannot simply vanish into thin air. Where is it? The 
answer will be revealed in appendix 4 if you can’t figure it out in the meantime.


Kruskal-Szekeres coordinates 


The requirement $dV^2 - dU^2 = {\rm sign}(r - r_S)\,dP\,dQ$ indicates that we should define, for
$r > r_S$, $V = \frac{1}{2}(P + Q)$, $U = \frac{1}{2}(P - Q)$, and for $r < r_S$, $V = \frac{1}{2}(P - Q)$, $U = \frac{1}{2}(P + Q)$. Note
the interchange between V and U outside and inside the horizon.

It is now simple and straightforward to determine V and U in terms of t and r in the
two regions. First,

$V^2 - U^2 = {\rm sign}(r - r_S)\,PQ = -{\rm sign}(r - r_S)\,e^{(p - q)/2r_S}$

$\quad = -{\rm sign}(r - r_S)\left(|r - r_S|/r_S\right)e^{r/r_S} = \left(1 - \frac{r}{r_S}\right)e^{r/r_S}$ (7)

The sign function disappears. (Also, note the factor $\left(1 - \frac{r}{r_S}\right) = (r_S - r)/r_S$, not $\left(1 - \frac{r_S}{r}\right) =
(r - r_S)/r$ as in the Schwarzschild metric!)

Second, for $r > r_S$, $V/U = (P + Q)/(P - Q) = \left(e^{(p + q)/2r_S} - 1\right)/\left(e^{(p + q)/2r_S} + 1\right) =
\tanh\frac{t}{2r_S}$, while for $r < r_S$, $V/U = (P - Q)/(P + Q) = \coth\frac{t}{2r_S}$. Evidently, the relation of
the Kruskal-Szekeres coordinates (V, U) and the Schwarzschild coordinates (t, r) depends
on the sign of $(r - r_S)$.

Outside the horizon, that is, for $r > r_S$, since $V/U = \tanh\frac{t}{2r_S}$, we have

$V = \left(\frac{r}{r_S} - 1\right)^{1/2}e^{r/2r_S}\sinh\left(\frac{t}{2r_S}\right)$, $\quad U = \left(\frac{r}{r_S} - 1\right)^{1/2}e^{r/2r_S}\cosh\left(\frac{t}{2r_S}\right)$ (8)

Inside the horizon, that is, for $r < r_S$, since $V/U = \coth\frac{t}{2r_S}$, we have

$V = \left(1 - \frac{r}{r_S}\right)^{1/2}e^{r/2r_S}\cosh\left(\frac{t}{2r_S}\right)$, $\quad U = \left(1 - \frac{r}{r_S}\right)^{1/2}e^{r/2r_S}\sinh\left(\frac{t}{2r_S}\right)$ (9)

Note that the factor $\left(\frac{r}{r_S} - 1\right)^{1/2}$ in one region and $\left(1 - \frac{r}{r_S}\right)^{1/2}$ in the other are both real,
otherwise (8) and (9) would not make sense.
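
The two-region map from Schwarzschild (t, r) to Kruskal-Szekeres (V, U) is easy to get wrong by a sign, so a small sketch may help. It assumes the forms $V, U \propto |r/r_S - 1|^{1/2}e^{r/2r_S}$ with sinh and cosh swapped across the horizon, and checks the invariant $V^2 - U^2 = (1 - r/r_S)e^{r/r_S}$. The function name is illustrative.

```python
import math

def kruskal_from_schwarzschild(t, r, r_s=1.0):
    """Return (V, U) for Schwarzschild (t, r), with r > 0 and r != r_s."""
    if r == r_s:
        raise ValueError("Schwarzschild t is not defined on the horizon r = r_s")
    amplitude = math.sqrt(abs(r / r_s - 1.0)) * math.exp(r / (2.0 * r_s))
    arg = t / (2.0 * r_s)
    if r > r_s:
        # Outside: V ~ sinh, U ~ cosh, so that V/U = tanh(t / 2 r_s).
        return amplitude * math.sinh(arg), amplitude * math.cosh(arg)
    # Inside: roles interchanged, so that V/U = coth(t / 2 r_s).
    return amplitude * math.cosh(arg), amplitude * math.sinh(arg)

for t, r in ((0.7, 2.0), (-1.3, 1.2), (0.4, 0.5), (2.0, 0.1)):
    V, U = kruskal_from_schwarzschild(t, r)
    invariant = (1.0 - r) * math.exp(r)
    print(f"t = {t:5.2f}, r = {r:4.2f}:  V^2 - U^2 = {V*V - U*U:9.5f}  expected {invariant:9.5f}")
```

Outside the horizon the points land in the wedge $|V| < U$; inside they land in $V > |U|$, below the hyperbola $V^2 - U^2 = 1$ that corresponds to r = 0.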


Kruskal-Szekeres diagram of the Schwarzschild black hole 


We can now describe spacetime around a black hole using (V, U) coordinates. See figure 5, 
known as a Kruskal-Szekeres diagram. 

Lines of constant t correspond to straight lines with some fixed slope as given by

$V/U = \theta(r - r_S)\tanh\frac{t}{2r_S} + \theta(r_S - r)\coth\frac{t}{2r_S}$

as plotted in figure 5a. (The step function is defined as usual by $\theta(x) = 1$ for $x > 0$ and
$\theta(x) = 0$ for $x < 0$.)

From (7), we see that the lines of constant r correspond to hyperbolas in the (V, U)
plane, “vertically oriented” hyperbolas for $r > r_S$ and “horizontally oriented” hyperbolas
for $r < r_S$. In particular, the physical singularity at r = 0 corresponds to the horizontally
oriented hyperbola $V = +\sqrt{U^2 + 1}$, with the plus sign mandated by (9). The horizon at
$r = r_S = 2GM$ degenerates into the two straight lines $V = \pm U$, as we could have deduced
from the fact that at the horizon, vertically oriented hyperbolas transition into horizontally
oriented hyperbolas.

From the Kruskal-Szekeres diagram, a small puzzle that might have bothered you 
in the radial plunge discussion in the preceding chapter, namely what happens to 
t after the observer passes the horizon, resolves itself. As he approaches the horizon, t 
increases, reaching $\infty$ when he reaches the horizon, and after he passes the horizon,
t decreases from $\infty$. By now you know better than to claim, as some crackpots do, that he
gets to live his life backward: it is proper time that counts. 

The moral of the story is that no single coordinate system is perfect. You would not 
want to calculate the perihelion shift of Mercury using the (V, U) coordinates; the (t, r) 
coordinates are clearly superior.

Figure 5 Kruskal-Szekeres diagram of the Schwarzschild black hole. (a) Note
the lines of constant t and r. (b) As indicated by the dashed line, an observer
falling into a black hole can escape before reaching the horizon V = U.


A common misconception alert! The lines of constant r are not geodesics. To hover at 
constant r outside a black hole requires constant and careful firing of a rocket pack strapped 
to your back. (Keep in mind also that the angular coordinates $\theta$, $\varphi$ are suppressed, and so
each point in figure 5 corresponds to a unit sphere.)

However, the observer indulging in the extreme sport “radial plunge” is following a
geodesic, as shown in figure 5b. Note that the angle the curve makes with the vertical has
to be less than 45° at all points along the curve. As you can see, his worldline will eventually
end at the hyperbola $V = +\sqrt{U^2 + 1}$, where infinite tidal forces await him.

Observers in theoretical physics, however, are endowed with free will. Dear reader, as a 
“young observer,” you could always strap a rocket pack on your back and fire it whenever 
you wish. As indicated by the dashed line in figure 5b, you can always escape from the 
black hole by firing your rocket pack before you reach the horizon V = U. But once you 
pass the horizon, then no amount of firing would allow you to come back out. Indeed, even 
light traveling along the 45° lines (V = U + positive constant) will eventually also end up 
(as indicated by the dotted line) at the physical singularity (as indicated by the jagged line), 
so that you can no longer send signals to your friends outside the black hole. 

The gravitational time dilation discussed earlier also becomes clear pictorially. The 
infalling observer sends off signals at regular proper time intervals, as indicated by the 
wavy lines in the figure. As you can see, for the observer hovering at some constant r 
outside the black hole, these signals arrive with ever increasing intervals between them. 
It is worth emphasizing again that, once fallen through the horizon, the observer is not 
in any way obliged to follow a geodesic. He could certainly fire his rocket pack and zip off 
this way or that, frolicking inside a black hole, as long as his worldline makes an angle of 
less than 45° with the vertical. 

The Kruskal-Szekeres diagram is drawn for an eternal black hole. For an actual physical 
black hole, the Kruskal-Szekeres diagram is physically relevant only to the right of the solid 
line, which could also be taken to depict the geodesic of a particle on the surface of the 
collapsing star. 


Penrose diagrams 


As this discussion makes clear, it is really advantageous to have radial light rays always 
move along 45° lines. To this, Roger Penrose added another attractive feature of having the 
range for the coordinates be finite. The resulting spacetime diagram is known’ as a Penrose 
diagram and is extraordinarily useful for seeing the causal structure of the spacetime. 

To see how this works, consider the easiest case of Minkowski spacetime $ds^2 = -dt^2 +
dr^2 + r^2d\Omega^2$. Write $p = t + r$, $q = t - r$ (namely, go to the light cone coordinates mentioned
in chapter III.3). Then $ds^2 = -dt^2 + dr^2 + r^2d\Omega^2 = -dp\,dq + r^2d\Omega^2$. (Note that
our convention is consistent with the $r_S \to 0$ limit of what we had in (3) and (4).) Since
t ranges over $(-\infty, \infty)$ and r over $(0, \infty)$, Minkowski spacetime covers the half plane
$r > 0$, each point of which corresponds to a sphere described by the suppressed angular
variables $\theta$ and $\varphi$. Lines of constant p correspond to $t = -r + p$, and of constant q to
$t = r + q$; the (p, q) coordinates are just the (t, r) coordinates rotated by 45°. Note that,
since $p - q = 2r > 0$, the region ($p < 0$, $q > 0$) is not allowed. The half plane is divided
into three regions, with ($p > 0$, $q > 0$), ($p > 0$, $q < 0$), and ($p < 0$, $q < 0$), separated by
the two lines defined by $q = 0$ and $p = 0$, respectively.

Figure 6 Penrose diagram of Minkowskian spacetime: the future and past
null infinities are denoted by $\mathcal{I}^+$ and $\mathcal{I}^-$, respectively, the future and past
timelike infinities by $I^+$ and $I^-$, respectively, and the spacelike infinity by
$I^0$. Causal relationships can now be determined at a glance. For example, if
the observer at B wants to send a message to an observer at rest at r = 0, the
earliest the message could reach her would be at point A.


Since (p, q) range over $(-\infty, \infty)$, we could compactify them by a simple change of
variable $p = \tan P$, $q = \tan Q$, so that (P, Q) range over the finite range $(-\frac{\pi}{2}, \frac{\pi}{2})$. Again,
spacetime consists of a triangle bounded by the three straight lines $P = \pi/2$, $Q = -\pi/2$,
and $P = Q$, divided into three regions, with ($P > 0$, $Q > 0$), ($P > 0$, $Q < 0$), and ($P <
0$, $Q < 0$). Finally, we can “rotate back” by writing $T = P + Q$, $R = P - Q$. The resulting
diagram is shown in figure 6. Minkowskian spacetime is represented by a triangle bounded
by the three straight lines $T = \pi - R$, $T = -\pi + R$, and $R = 0$.
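
The three steps (rotate, compactify, rotate back) can be sketched in a few lines. The map below takes Minkowski (t, r) with $r \ge 0$ to the Penrose coordinates (T, R); the function name is illustrative, and arctan is simply the inverse of the compactifying function tan.

```python
import math

def penrose_coordinates(t, r):
    """Map Minkowski (t, r), r >= 0, to compactified coordinates (T, R)."""
    p, q = t + r, t - r                  # light cone coordinates
    P, Q = math.atan(p), math.atan(q)    # each now lies in (-pi/2, pi/2)
    return P + Q, P - Q                  # rotate back: T = P + Q, R = P - Q

for t, r in ((0.0, 0.0), (1.0, 1.0), (1e6, 5.0), (-3.0, 1e9), (0.0, 1e12)):
    T, R = penrose_coordinates(t, r)
    print(f"(t, r) = ({t:g}, {r:g})  ->  (T, R) = ({T:.4f}, {R:.4f})")
```

Every point lands inside the triangle $R \ge 0$, $-\pi + R \le T \le \pi - R$. Events at very large r and fixed t pile up near $(T, R) = (0, \pi)$, and events at very late t and fixed r pile up near $(\pi, 0)$.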

Light rays (or null lines) propagate at 45°, and so the future and past light cones are 
easily drawn. (For example, the future light cone of the observer at B is indicated by the 
dashed lines.) 

As indicated in the figure, it is customary to denote the future and past null infinities
by $\mathcal{I}^+$ and $\mathcal{I}^-$, respectively,* where null lines originate and end up; the future and past
timelike infinities by $I^+$ and $I^-$, respectively; and the spacelike infinity by $I^0$. Keep in
mind that the angular coordinates have been suppressed, so that $I^0$ actually represents
“the sphere at infinity” often mentioned in physics.


* In case some people get bent out of shape, I might mention that for null infinity, I use the “calligraphic I” = $\mathcal{I}$
(following, for example, S. Hawking and R. Penrose°), while many authors use some version of the “calligraphic
J” = $\mathcal{J}$ (as defined by the TeX typesetting program). It is a trivial distinction, dependent merely on who your
teacher was in penmanship class.

Figure 7 Penrose diagram of the Schwarzschild black hole.


Note that the specific “compacting” function tan used to relate a variable (P or Q, for 
example) with a finite range to a variable (p or q, for example) with an infinite range hardly 
matters in the present context. It is the causal structure of spacetime that we are after. For 
Minkowski spacetime, evidently, every timelike worldline will end up at $\mathcal{I}^+$ (excepting
one line that ends up at $I^+$). Causal relationships can now be determined at a glance. For
example, if the observer at B wants to send a message to an observer at rest at r = 0, the 
earliest the message could reach her would be at point A. By the same token, if the observer 
at B wants to get to r = 0, there is no way he could get there before point A. 

We will have a bit more to say about Minkowski spacetime in appendix 5. 

Now it is easy to draw the Penrose diagram for Schwarzschild spacetime: we simply bring
the various infinities in, just as we brought the various infinities in Minkowski spacetime 
in. The result is shown in figure 7. If you want, you could work through the arithmetic 
following the same procedure as for Minkowski spacetime, rotate by 45°, compactify 
variables, and then rotate back by 45°, but there is no point in doing this. 


Sewing spacetimes together 


I now describe the formation of a black hole under the simplest circumstances that 
theoretical physicists have come up with. The black hole is formed by the collapse of a thin 
spherical shell of photons in Minkowski spacetime, all moving radially inward toward a 
point. The description of the formation process involves “sewing” two distinct spacetimes 
together. Roughly speaking, before the collapse of the shell of photons, spacetime is 
Minkowski, while after the collapse, spacetime becomes Schwarzschild. Thus, we have 
to join two distinct spacetimes together. 

Let us start by envisaging the thin spherical shell of photons (a) in a Minkowski space- 
time, and (b) in an eternal Schwarzschild spacetime. These scenarios are depicted in 
figure 8a and figure 8b, respectively, with the shell represented by a double line. These two 
figure panels represent situations in which the shell contains very little energy and has a 
negligible effect on the existing spacetimes, Minkowski in one case and Schwarzschild in 
the other. 


Figure 8 A thin spherical shell of photons, represented by a 
double line, (a) in a Minkowski spacetime, and (b) in an eternal 
Schwarzschild spacetime. (c) After excising the physically 
irrelevant regions from the spacetime in panels a and b, we sew 
together the two physically relevant regions to form one single 
spacetime. Before reaching point H, you could still reach the 
future timelike infinity. But after point H, your fate eventually 
is to meet the singularity represented by the jagged line. 


But now suppose that the shell of light contains sufficient energy to form a black hole. 
Then the Minkowski spacetime above and to the right of the double line in figure 8a is no 
longer relevant: a black hole has formed! That region of Minkowski spacetime should be 
excised. 

The situation is quite different in figure 8b. We are trying to describe the formation of a 
black hole due to an incoming shell of light. Before the shell arrives, spacetime is supposed 
to be Minkowskian. Thus, the region below and to the left of the double line in figure 8b is 
physically irrelevant and should be excised. Also, we don’t think eternal black holes exist 
physically, so in any case, the spacetime depicted in figure 8b should not be taken in its 
entirety. 

So, we cut off the parts of figure 8a and figure 8b that are irrelevant. Let us then “sew 
together” the two physically relevant regions left in these two figures to form one single 
spacetime, as shown in figure 8c. Before the shell arrives, we have flat spacetime, and af- 
terward, a Schwarzschild spacetime containing a black hole. Thus, this cutting and sewing 
construction provides us with a spacetime description of an idealized black hole forma- 
tion process. The resulting Penrose diagram shows a black hole forming in an initially flat 
spacetime, causing a horizon (indicated by the solid line in the Schwarzschild portion of 
the composite spacetime, continued as a dotted line into the Minkowski spacetime before
the shell of light arrives) and a physical singularity (indicated by the jagged line) to form.”
Note that all angles in this figure are at either 45° or 90°.

This spacetime diagram tells an intriguing story. Suppose you are living a contented life 
at the origin (indicated by the vertical line in the figure, which serves to represent your 
worldline) of a Minkowski spacetime. You have no idea that a monster shell of light is 
coming at you at light speed. Before point H, you could still blast off and reach the future 
timelike infinity denoted by $I^+$, if your rocket were fast enough. But after point H, you are
totally doomed: your fate is to meet the jagged line sooner or later. 

This description indicates that the horizon is a global, not a local, concept characterizing
the causal structure. At point J, after you pass H, no signal from the monster shell of light 
can have reached you yet, and you could well be minding your own business. But already 
it is too late for you! No matter what you do, you would still be headed toward the physical 
singularity. 

Let me emphasize again that the coordinate independent measure of curvature
$R^{\mu\nu\rho\sigma}R_{\mu\nu\rho\sigma} = 12r_S^2/r^6$ evaluated at the Schwarzschild radius $r_S$ goes like $1/r_S^4$. Thus, by
taking $r_S = 2GM$ large, we can make this measure of curvature as small as we like, so that
spacetime around a black hole could appear to be arbitrarily close to everyday flat spacetime.
Nevertheless, the global causal structure of spacetime has been changed essentially
and irrevocably.
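
To get a feeling for how gentle the horizon of a large black hole can be, here is a rough numerical sketch. It assumes the curvature invariant equals $12r_S^2/r^6$, so that it is $12/r_S^4$ at $r = r_S$, and restores factors of c in $r_S = 2GM/c^2$. The constants are approximate SI values, and the function names are illustrative.

```python
G = 6.674e-11       # Newton's constant, m^3 kg^-1 s^-2 (approximate)
C = 2.998e8         # speed of light, m/s (approximate)
M_SUN = 1.989e30    # solar mass, kg (approximate)

def schwarzschild_radius(mass_kg):
    """r_S = 2GM/c^2 in meters."""
    return 2.0 * G * mass_kg / C**2

def curvature_invariant_at_horizon(mass_kg):
    """12 r_S^2 / r^6 evaluated at r = r_S, that is 12 / r_S^4, in m^-4."""
    return 12.0 / schwarzschild_radius(mass_kg) ** 4

for solar_masses in (1.0, 1.0e6, 1.0e9):
    mass = solar_masses * M_SUN
    print(f"M = {solar_masses:8.0e} M_sun   r_S = {schwarzschild_radius(mass):10.3e} m"
          f"   invariant = {curvature_invariant_at_horizon(mass):10.3e} m^-4")
```

Because the invariant scales like $1/M^4$, a black hole a million times heavier has a curvature invariant at its horizon smaller by a factor of $10^{24}$.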


Appendix 1: Eddington-Finkelstein coordinates 


I describe here a coordinate system first used by Eddington in 1924, and then rediscovered by Finkelstein in 
1958. Speaking loosely, we can think of the Eddington-Finkelstein coordinates as something halfway between 
the Schwarzschild and the Kruskal-Szekeres coordinates. Go back to the second line in (2), which I reproduce 
here for convenience: 


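$$ds^2 = -\left(\frac{r-r_S}{r}\right)\left(dt + \frac{r}{r-r_S}\,dr\right)\left(dt - \frac{r}{r-r_S}\,dr\right) + r^2 d\Omega^2$$

Define $dp = dt + \frac{r}{r-r_S}dr$ as before but not $dq$. In terms of this new coordinate, we have 
$dt - \frac{r}{r-r_S}dr = dp - \frac{2r}{r-r_S}dr$. Thus, 

$$ds^2 = -\left(\frac{r-r_S}{r}\right)dp^2 + 2\,dp\,dr + r^2 d\Omega^2 \qquad (10)$$

Radial light rays follow paths determined by solving $\left(\frac{r-r_S}{r}\right)dp^2 = 2\,dp\,dr$. Light rays along $dp = 0$ are always 
ingoing. Light rays along $\left(\frac{r-r_S}{r}\right)dp = 2\,dr$ are outgoing for $r > r_S$ and ingoing for $r < r_S$. 

Recall that $d\tilde{t} = dt + \frac{r_S}{r-r_S}dr$. Be careful to distinguish $d\tilde{t}$ from $dp$! In fact, $dp = d\tilde{t} + dr$, so that the $dp = 0$ light 
rays are just the $d\tilde{t} + dr = 0$ light rays discussed earlier. We can also integrate 
$dp = d\tilde{t} + dr = dt + \left(1 + \frac{r_S}{r-r_S}\right)dr$ to obtain $p = \tilde{t} + r = t + r + r_S\log\left|\frac{r-r_S}{r_S}\right|$. 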
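
The two statements in this appendix that are easiest to check numerically are the sign of $dr/dp$ along the second family of light rays and the integrated form of $p$. The following Python sketch does both, in units with $r_S = 1$ by default; the function names and the finite-difference step are illustrative assumptions.

```python
import math


def outgoing_slope(r, r_s=1.0):
    """dr/dp along the light rays ((r - r_S)/r) dp = 2 dr in the metric (10)."""
    return (r - r_s) / (2.0 * r)


def p_coordinate(t, r, r_s=1.0):
    """Integrated coordinate p = t + r + r_S log|(r - r_S)/r_S|."""
    return t + r + r_s * math.log(abs(r - r_s) / r_s)


def dp_dr_numeric(r, r_s=1.0, h=1e-6):
    """Central finite difference of p with respect to r at fixed t."""
    return (p_coordinate(0.0, r + h, r_s) - p_coordinate(0.0, r - h, r_s)) / (2.0 * h)


if __name__ == "__main__":
    for r in (0.5, 2.0, 3.0):
        exact = r / (r - 1.0)  # coefficient of dr in dp = dt + r/(r - r_S) dr
        print(f"r = {r}: dr/dp (second family) = {outgoing_slope(r):+.3f}, "
              f"dp/dr numeric = {dp_dr_numeric(r):+.6f}, expected = {exact:+.6f}")
```

The slope is positive for $r > r_S$ and negative for $r < r_S$, so inside the horizon both families of radial light rays move toward smaller $r$.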


Appendix 2: Area of the horizon 


We ask Confusio, “What is the dimension of the horizon of a black hole?” He responds, “Let’s see. Set $r = r_S + \varepsilon$ in 
the Schwarzschild solution (1) to get $ds^2 \simeq -\frac{\varepsilon}{r_S}dt^2 + r_S^2 d\Omega^2$ (since $dr = 0$). Sure looks like it’s 2 + 1 dimensional.” 
Good. As we've noticed, Confusio is getting less confused by the day. Indeed, most people think of the horizon 
as a mathematical 2-sphere surrounding the black hole. You add time and it’s 2 + 1 dimensional for sure. 

But now set $\varepsilon = 0$. Time goes. Not just time goes by, but time literally goes. We are left with $ds^2 = r_S^2 d\Omega^2$; 
the actual horizon, contrary to what the naive might think, is actually 2-dimensional and has an area 

$$A = 4\pi r_S^2 = 16\pi (GM)^2 \qquad (11)$$

That this 2-step discussion is even necessary points to the deficiency of the Schwarzschild coordinates. If 
we set $r = r_S$ in the $(p, q)$ coordinates (4), or $V = U$ in the Kruskal-Szekeres coordinates (6), or $r = r_S$ in the 
Eddington-Finkelstein coordinates (10), we obtain immediately $ds^2 = r_S^2 d\Omega^2$ and the area law (11). 


Appendix 3: Misleading to show the black hole as a funnel or 
as a rubber sheet 


You've probably seen a picture of a black hole depicted as a kind of funnel, or alternatively as a rubber sheet 
depressed by a heavy round mass. Far away from the funnel, or the depression in the rubber sheet, the surface 
is supposed to be flat. I will pointedly not show this picture (you can draw it yourself based on the mathematical 
description given below), but it and its variants have appeared in countless magazines, newspapers, popular 
books, and even on the cover of a textbook. In many science museums, visitors are invited to toss a small ball 
onto the surface of an actual funnel shaped construction. If you toss the ball with sufficient speed in an angular 
direction, it will orbit around the central funnel, slowly spiraling into the dark “bottomless” pit in the center. And 
of course, if you toss the ball in the radial direction, it will fall right in, “sucked in by the irresistible force” of 
the black hole, often thought of as a “source of evil” in the visitor's mind. You know of course that this display 
depicts the sun equally well. 

This museum display entertains the visitors and educates them to some extent, but D. Marolf has pointed out 
that it is misleading at best. For sure, it has seriously confused some students. 

This popular picture and the display that goes with it are obtained by setting t equal to some constant and 
$\theta = \pi/2$ in the Schwarzschild metric (1) to obtain 

$$ds^2 = \left(1-\frac{r_S}{r}\right)^{-1}dr^2 + r^2 d\varphi^2 \qquad (12)$$

which is then embedded in 3-dimensional Euclidean space $E^3$. (The museum staff could hardly do otherwise.) 
Using the usual cylindrical coordinates $(z, \rho, \phi)$ for $E^3$, we specify the embedding by writing $z = f(r)$, $\rho = r$, 
$\phi = \varphi$. You can work out $f(r)$ if you want, but it is not necessary here. In science museums, they don’t use the 
actual $f(r)$, but instead, use an $f(r)$ such that $f(r) \to$ constant for large $r$ and $f'(a) = -\infty$ for some small value 
of $a$. So you see why I don’t need to draw a picture for you! 
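
For readers who like a numerical sanity check, the embedding condition can be tested directly. With $z = f(r)$ and $\rho = r$, the flat metric $dz^2 + d\rho^2 + \rho^2 d\phi^2$ induces $(1 + f'(r)^2)\,dr^2 + r^2 d\varphi^2$ on the surface, which must match the slice (12). The Python sketch below takes the $f(r)$ quoted in the exercise at the end of this chapter as given and compares the two radial coefficients by finite differences; it is a check, not a derivation, and the function names are illustrative.

```python
import math


def f_exercise(r, r_s=1.0):
    """Embedding function quoted in the chapter's exercise: f(r) = 2 sqrt(r_S (r - r_S))."""
    return 2.0 * math.sqrt(r_s * (r - r_s))


def induced_grr(r, r_s=1.0, h=1e-6):
    """Radial metric coefficient 1 + f'(r)^2 induced on the surface z = f(r), rho = r."""
    df = (f_exercise(r + h, r_s) - f_exercise(r - h, r_s)) / (2.0 * h)
    return 1.0 + df * df


def schwarzschild_grr(r, r_s=1.0):
    """Radial coefficient of the t = constant, theta = pi/2 slice: 1 / (1 - r_S/r)."""
    return 1.0 / (1.0 - r_s / r)


if __name__ == "__main__":
    for r in (1.5, 2.0, 5.0, 10.0):
        print(f"r/r_S = {r}: induced = {induced_grr(r):.6f}, slice = {schwarzschild_grr(r):.6f}")
```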

Marolf’s point is that this picture represents a slice in time and is not directly connected to the gravitational 
attraction of the black hole. (The actual force “sucking” the ball into the funnel is of course supplied externally, 
by the earth.) In fact, there are spacetimes with the same $t$-$\theta$ slice as (12) but with totally different gravitational 
fields from those in the Schwarzschild case (as you saw in exercises V.4.4 and V.4.5). 

To obtain a more appropriate representation of the black hole, we should take a slice of (1) with $\theta$ and $\varphi$ both 
constant (in contrast to the funnel picture based on a slice with $t$ and $\theta$ constant) and then embed the slice in 
(2 + 1)-dimensional Minkowski spacetime $M^{2,1}$. The resulting picture contains two flanges. 


Appendix 4: Wormholes and such 


Confusio grumbles, “Did we not cheat? I can see that the metric in the Kruskal-Szekeres coordinates 

$$ds^2 = \frac{4r_S^3}{r}e^{-r/r_S}\left(-dV^2 + dU^2\right) + r^2 d\Omega^2 \qquad (13)$$

[reproduced here for convenience] is free of the coordinate singularity at $r = r_S$, but the transformation (8) and 
(9) from $(t, r)$ to $(V, U)$ is not smooth as we cross over $r = r_S$ (that is, $dV/dr$, $dU/dr$ are singular at $r = r_S$).” 
Indeed, Confusio is right, but that’s just the law of calculus: if we transform from singular to nonsingular 
coordinates, the transformation necessarily must be singular. Think about it this way. Imagine a civilization far far 
away in which the metric (13) just fell on the head of a physicist who had never heard of the usual Schwarzschild 
metric written in $(t, r)$ coordinates. Or perhaps more likely, a mathematician simply presented it to a physicist, 
saying, “Lo, behold this metric: it solves Einstein’s field equation in empty spacetime.” (The symbol $r$, defined 
earlier in (7) by $\left(1 - \frac{r}{r_S}\right)e^{r/r_S} = V^2 - U^2$, is a perfectly well-defined function of $V^2 - U^2$ for $-\infty < V^2 - U^2 < 1$.) 

What we should ask is how the original spacetime maps into the Kruskal spacetime. Indeed, (8) shows that 
every point $(t, r)$, for $\infty > r > r_S$ and $\infty > t > -\infty$, maps to a unique point $(V, U)$ in quadrant I in figure 7, and 
(9) shows that every point $(t, r)$, for $r_S > r > 0$ and $\infty > t > -\infty$, maps to a unique point $(V, U)$ in quadrant II. 
So the original spacetime maps only into half of the Kruskal spacetime, namely the half defined by V > —U. The 
crucial question is, what do we make of quadrants III and IV? They do not correspond to anything in the original 
spacetime. For example, take the negative U-axis defined by V = 0, U < 0. Examining (9), we see that this does 
not exist in the original spacetime. Mathematicians say that the Kruskal coordinates define an extension of the 
Schwarzschild solution. 

Physicists, including astrophysicists, typically take the attitude that for an actual black hole formed from a 
collapsing star, a solution to Einstein’s field equation in empty spacetime is relevant only to the right of the 
worldline of a massive particle in figure 5b. In other words, we now think of what we previously called the 
worldline of an observer falling into an eternal black hole as the worldline of a particle on the surface of a 
collapsing star. Quadrants III and IV in figure 7 are physically irrelevant (at least until further discoveries in 
physics*). 

In contrast, mathematicians and speculators can certainly invite themselves (free country, remember?) to 
study the spacetime described mathematically by (13). Looking at (13), we see that, since quadrant III is related 
to I by $U \to -U$, the two quadrants are the same: III also describes the outside of a black hole, approaching 
an asymptotically flat spacetime. Quadrant IV is more peculiar, with a physical singularity at $V = -\sqrt{U^2 + 1}$. 
Classical general relativity cannot tell us anything about what actually happens at a physical singularity except 
that the theory breaks down. We need a theory of quantum gravity. Naively, we can draw lines coming out of this 
physical singularity with light at 45° and so on, capable of propagating into I and III, so that it looks like what we 
might call a “white hole,” whatever that means—a place where particles could come streaming out. Clearly, our 
present understanding of physics does not allow us to say anything meaningful, which of course does not deter 
people from publishing any number of speculative papers. 

It is somewhat interesting to look at the $V = 0$ slice connecting the two asymptotically flat spacetimes I and 
III and described by the 3-dimensional line element $ds^2 = \frac{4r_S^3}{r}e^{-r/r_S}dU^2 + r^2 d\Omega^2$. By (8), $V = 0$ implies that 
$U = \left(\frac{r}{r_S} - 1\right)^{1/2}e^{r/2r_S}$ and $dU = \frac{r}{2r_S^2\left(\frac{r}{r_S} - 1\right)^{1/2}}e^{r/2r_S}dr$. Inserting into (13), we see that this spatial slice is described by 

$$ds^2 = \frac{1}{1 - \frac{r_S}{r}}dr^2 + r^2 d\Omega^2 \qquad (14)$$

The reader with a long memory would recognize this as the Einstein-Rosen bridge discussed way back in 
chapter I.6, which John Wheeler⁹ picturesquely described as a wormhole. 

The question naturally arises, if an eternal black hole exists somewhere, whether one could get through the 
wormhole to another asymptotically flat universe. Inspection of the Penrose diagram in figure 8 shows that it is 
not possible to get from I to III even if you were to travel at the speed of light. However, an observer starting in 
I, after falling through the horizon into II, could receive signals originating from within III. In other words, our 
intrepid observer, while unable to get to III, can look at part of III. 

An important point is that while the Einstein-Rosen bridge can be studied as a static 3-dimensional space, the 
wormhole is a dynamic entity evolving in time. Indeed, let’s take the $V = V_0 > 0$ slice of the Kruskal spacetime. 
(Recall that $V$ is the timelike variable.) From (7), we have $U = \pm\sqrt{V_0^2 - \left(1 - \frac{r}{r_S}\right)e^{r/r_S}}$. Substituting this and 
$V = V_0$ into (13), we find 

$$ds^2 = \frac{1}{1 - \frac{r_S}{r}\left(1 - V_0^2 e^{-r/r_S}\right)}dr^2 + r^2 d\Omega^2 \qquad (15)$$

We see that the throat of the wormhole, determined by the value of $r$ where $g_{rr} \to \infty$, decreases from $r_S$ at $V_0 = 0$, 
approaching 0 as $V_0 \to 1$, and reaches 0 when we hit the physical singularity at $V_0 = 1$. The wormhole closes up 
at the physical singularity. By dimensional analysis, since $r_S$ is the only dimensionful parameter around, we see 
that the wormhole closes up† on a time scale of the order of $r_S$. 
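
The closing of the throat can be traced numerically. From (15), $g_{rr} \to \infty$ where $r/r_S = 1 - V_0^2 e^{-r/r_S}$, a transcendental equation with a single root in $0 \le r \le r_S$ for $0 \le V_0 \le 1$. The Python sketch below solves it by bisection; the function name, the choice of bisection, and the tolerance are illustrative assumptions rather than anything prescribed by the text.

```python
import math


def throat_radius(v0, r_s=1.0, tol=1e-12):
    """Radius where g_rr in (15) diverges, i.e. the root of x - 1 + v0^2 exp(-x) = 0, x = r/r_S."""
    if not 0.0 <= v0 <= 1.0:
        raise ValueError("this sketch covers only 0 <= V_0 <= 1")

    def g(x):
        return x - 1.0 + v0 * v0 * math.exp(-x)

    # g(0) = v0^2 - 1 <= 0 and g(1) = v0^2 / e >= 0, and g is increasing on (0, 1],
    # so the root is bracketed by [0, 1] and bisection converges to it.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return r_s * 0.5 * (lo + hi)


if __name__ == "__main__":
    for v0 in (0.0, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0):
        print(f"V_0 = {v0:4.2f}: throat at r/r_S = {throat_radius(v0):.6f}")
```

The throat radius shrinks monotonically from $r_S$ at $V_0 = 0$ to zero at $V_0 = 1$, in line with the statement that the wormhole pinches off at the physical singularity.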


* I subscribe to this attitude. What attitude you take is of course up to you. 
† This classical analysis, like the rest of this chapter, completely ignores the increasingly large fluctuations 
due to quantum gravity as we approach the physical singularity. 
Appendix 5: A bit more on Minkowski spacetime 


In the text, we constructed the Penrose diagram for Minkowski spacetime as follows: rotate (by 45°), compactify, 
rotate back, that is, by the sequential changes of variables* $t = \frac{1}{2}(p + q)$, $r = \frac{1}{2}(p - q)$, $p = \tan P$, $q = \tan Q$, 
$T = P + Q$, $R = P - Q$. The metric is transformed as 

$$\begin{aligned} ds^2 &= -dt^2 + dr^2 + r^2 d\Omega_{d-2}^2 \\ &= -dp\,dq + \tfrac{1}{4}(p - q)^2 d\Omega_{d-2}^2 \\ &= \frac{1}{4\cos^2 P\cos^2 Q}\left(-dT^2 + dR^2 + \sin^2 R\, d\Omega_{d-2}^2\right) \end{aligned} \qquad (16)$$

As explained in the text and as shown in figure 6, spacetime consists of a triangle. (Indeed, the factor 
$(\cos^2 P\cos^2 Q)^{-1}$ indicates that the coordinates end at $P = \pi/2$, $Q = -\pi/2$.) We see that $ds^2$ is conformally 
related (see exercise I.5.14) to $d\tilde{s}^2 = -dT^2 + dR^2 + \sin^2 R\, d\Omega_{d-2}^2$. Recall that $R$ runs between 0 and $\pi$, and thus in 
spite of appearances, this spacetime, while conformally related to the flat Minkowski spacetime, is not flat. Space 
consists of the sphere $S^{d-1}$, with $R = 0$ and $R = \pi$ corresponding to the north and south poles, respectively. 
Indeed, we might want to rename $R$ as $\theta$, the familiar latitude. (Note that we have generalized slightly from the 
text to consider $M^{d-1,1}$ with no cost to us; as usual, the angular coordinates just go along for the ride.) 
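
The angular term in (16) rests on the identity $\frac{1}{4}(\tan P - \tan Q)^2 = \sin^2(P - Q)/(4\cos^2 P\cos^2 Q)$, which is easy to spot-check numerically. The Python sketch below does so at a few sample points inside the triangle; the sample points and function names are illustrative assumptions.

```python
import math


def flat_coords(P, Q):
    """Map the compactified null coordinates (P, Q) back to (t, r) via p = tan P, q = tan Q."""
    p, q = math.tan(P), math.tan(Q)
    return 0.5 * (p + q), 0.5 * (p - q)


def conformal_factor(P, Q):
    """Overall factor 1 / (4 cos^2 P cos^2 Q) multiplying the metric in (16)."""
    return 1.0 / (4.0 * math.cos(P) ** 2 * math.cos(Q) ** 2)


def angular_coefficients(P, Q):
    """Return (r^2, conformal factor times sin^2 R) with R = P - Q; the two should agree."""
    _, r = flat_coords(P, Q)
    R = P - Q
    return r * r, conformal_factor(P, Q) * math.sin(R) ** 2


if __name__ == "__main__":
    for P, Q in [(0.7, -0.3), (1.2, 0.4), (0.1, -1.0)]:
        lhs, rhs = angular_coefficients(P, Q)
        print(f"P = {P}, Q = {Q}: r^2 = {lhs:.10f}, factor * sin^2 R = {rhs:.10f}")
```

The sketch checks only the coefficient of $d\Omega_{d-2}^2$; the $-dp\,dq$ term follows from $dp = dP/\cos^2 P$, $dq = dQ/\cos^2 Q$, and $-dT^2 + dR^2 = -4\,dP\,dQ$.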

But if somebody handed us $d\tilde{s}^2$ with $T$ and $R$ restricted to the triangular region, we could invite ourselves to extend 
this spacetime outside the triangle: simply let $T$ run from $-\infty$ to $+\infty$. Without the factor $(\cos^2 P\cos^2 Q)^{-1}$, 
we can wander outside the triangle in figure 6 with impunity. The resulting spacetime has the topology of 
$R \times S^{d-1}$ and is known as the maximal extension of the Minkowski spacetime we started out with, which now 
corresponds to a patch in this spacetime. 

We now note that (1 + 1)-dimensional Minkowski spacetime is a special case: as we can see in (16), $d\Omega_{d-2}^2$ 
degenerates for $d = 2$. Instead of starting out with $(t, r)$ with $0 \le r < \infty$, we have $(t, x)$ with $-\infty < x < \infty$ 
and thus 

$$ds^2 = -dt^2 + dx^2 = -dp\,dq = \frac{1}{4\cos^2 P\cos^2 Q}\left(-dT^2 + dX^2\right)$$

Now the coordinate $X$ ranges between $-\pi$ and $\pi$, in contrast to the coordinate $R$, which ranges between 0 and $\pi$ 
in the higher dimensional cases. Instead of a triangle, spacetime now consists of a diamond shaped region, that 
is, a square rotated by 45°. (Another way of saying this is that spatial boundary $S^0$ is not connected and contains 
2 points. In contrast, $S^{d-2}$ is connected for $d > 2$.) 

For the statement that the maximal extension of Minkowski spacetime $M^{d-1,1}$ has the topology of $R \times S^{d-1}$ 
to be applicable also for $d = 2$, we could identify the two points $(T = 0, X = \pi)$ and $(T = 0, X = -\pi)$. Then $M^{1,1}$ 
is maximally extended to $R \times S^1$, familiarly known as the cylinder. We will come back to this in chapter IX.11. 


Exercise 


1 Show that for the actual Schwarzschild solution, the science museums should use $f(r) = 2\sqrt{r_S(r - r_S)}$. 


Notes 


1. For readers utterly unfamiliar with the story, I recommend chapters 20-22 in R. Freedman and W. Kauffman, 
The Universe. 

2. Kruskal was a distinguished plasma physicist. This episode in the history of physics reminds us that there 
is a huge gulf between “nonexperts” and crackpots. 

3. In hindsight, it might seem somewhat surprising that the Kruskal coordinates were found only in 1960. 
Allegedly, John Wheeler had to compel a reluctant Martin Kruskal to publish his work by writing it up for 
him. M. Kruskal’s paper (Physical Review 119 (1960), p. 1743) carries a note stating “This work was reported 
in abbreviated form by J. A. Wheeler on behalf of the author,” suggesting that Wheeler did indeed force 
Kruskal to write it up. 

* Note the factors of 2. 

4. In hindsight, it really is puzzling how the confusion over the nonexistent Schwarzschild singularity persisted 
for so long. Apparently, G. Lemaître (recall chapter V.3; see also chapter VIII.1) had already shown in 1933 that 
the “singularity” could be removed by a coordinate transformation. But his paper, published in a little-read 
Belgian journal, was roundly ignored. Later, in 1950, J. L. Synge also clarified the nature of this nonsingularity. 
See A. Gsponer, arXiv:physics/0408100. I understand that in the former Soviet Union, I. Novikov had long 
understood that there is only a coordinate singularity at the horizon. 

5. Perhaps more accurately, as a Carter-Penrose diagram. 

6. S. Hawking and R. Penrose, The Nature of Space and Time, Princeton University Press, 1996, pp. 42 and 43. 

7. For a nice pedagogical treatment of how the horizon appears as a black hole forms, see the not terribly 
well-known work by R. Adler, J. Bjorken, P. S. Chen, and J. S. Liu, “Simple Analytical Models of Gravitational 
Collapse,” American Journal of Physics 73 (2005), p. 1148. 

8. See D. Marolf, arXiv:gr-qc/9806123 for more details. 

9. As an undergraduate, I was a devotee of John Wheeler. In an article titled “John Wheeler’s mentorship: An 
enduring legacy,” Physics Today 62 (2009), p. 55, T. M. Christensen wrote, “Among the eminent physicists 
who were influenced as undergraduates by personal contact with Wheeler are James Hartle, David Sharp, 
Bruce Partridge, Anthony Zee, and Gary Horowitz.” See also my letter in the same volume. 
Quantum fluctuations can set you free 


Nothing can get out of black holes. Spacetime is warped in such a way that, once inside the 
horizon, even light can never emerge, as the Kruskal diagram shows for a Schwarzschild 
black hole. We worked this picture out in detail in the preceding chapters. 

But as you have no doubt heard, that picture, painted exclusively with classical physics, 
no longer holds true when quantum effects are turned on, as we have already discussed in 
the introduction to this text. Black holes radiate as black bodies, each with a temperature 
characteristic of the specific black hole. Indeed, we were even able to determine, purely 
by dimensional analysis supplemented by a bit of basic knowledge about gravity, that the 
Hawking temperature for a Schwarzschild black hole is given by 


$$T_H \sim \frac{1}{GM} \sim \frac{\hbar c^3}{GM} \qquad (1)$$


with M the mass of the black hole. You may wish to go back to the introduction to review 
how we did that. There we already noted that this simply derived result indicates that black 
hole radiation ends explosively. As M decreases, T goes up.¹ 

From (1), using the thermodynamic definition of entropy $dM = TdS$, we immediately 
determined the entropy to be 

$$S \sim GM^2 \sim \frac{GM^2}{\hbar c} \qquad (2)$$

I have used dimensional analysis to restore $\hbar$ in (1) and (2), thus showing clearly that 
for $\hbar = 0$, we have $T = 0$ and $S = \infty$, so that classically, black holes do not radiate. 

We now try to understand how quantum effects could change the picture so drastically. 
At the most handwaving and heuristic level, with quantum fluctuations, a photon can no 
longer be sure which side of the horizon it is on, and thus there is some chance it could 
get out. Let’s put substance on this basically correct explanation by learning about the 
phenomenon of the restless vacuum in quantum field theory. 
The essence of quantum field theory in five minutes 


What is quantum field theory? Why is quantum field theory necessary? 

We need quantum field theory² when we confront simultaneously the two great physics 
innovations of the last century of the previous millennium: special relativity and quantum 
mechanics. Consider a rocket ship moving close to light speed. You need special relativity 
but not quantum mechanics to study its motion. In contrast, to study a slow electron 
scattering off of a proton, you must invoke quantum mechanics, but you don’t have to 
know a thing about special relativity. 

It is in the peculiar confluence of special relativity and quantum mechanics that a new 
set of phenomena arises: particles can be born and particles can die. It is this matter 
of birth, life, and death that requires the development of a new subject in physics, that 
of quantum field theory.* Let me explain presently how special relativity and quantum 
mechanics together can lead to dramatically novel physics. 

Consider empty spacetime. The vacuum, which we normally think of as vacuous, is (according 
to quantum field theory) rather astonishingly actually a boiling sea of fluctuating 
pairs of particles and antiparticles, containing, for example, pairs consisting of an electron 
and a positron (the electron’s antiparticle). 

Before special relativity came along, you couldn’t simply conjure up the mass of the 
electron and of the positron out of the vacuum. But with Einstein’s gold-plated equation 
$E = mc^2$, you could if you have enough energy. 

However, without quantum mechanics, the process is still forbidden by energy conservation. 
Where would the necessary energy come from? 

The gold-plated equation of quantum mechanics, Heisenberg’s uncertainty principle, 
$\Delta t \sim \hbar/\Delta E$, comes to the rescue. When Nature balances her accounts, she can tolerate 
briefly a certain amount of fuzziness. 

Thus, in a world with both special relativity and quantum mechanics, an electron and 
a positron can pop out of the vacuum, but only for a characteristic time of at most† order 
$\Delta t \sim \hbar/(m_e c^2)$, a very small time interval by everyday standards, after which the electron 
and the positron must annihilate each other and disappear back into the vacuum. Quantum 
electrodynamics was invented partly to deal with this sort of vacuum fluctuation. 
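
To see just how small this time interval is, one can evaluate $\hbar/(m_e c^2)$ directly. The following Python sketch assumes SI values for $\hbar$, $c$, and the electron mass; the function name is illustrative, and the estimate is an order of magnitude only, as in the heuristic argument above.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299_792_458.0        # speed of light, m/s
M_E = 9.1093837015e-31   # electron mass, kg


def fluctuation_time(mass_kg):
    """Heuristic lifetime of a virtual pair, Delta t ~ hbar / (m c^2), in seconds."""
    return HBAR / (mass_kg * C**2)


if __name__ == "__main__":
    dt = fluctuation_time(M_E)
    print(f"Delta t ~ {dt:.3e} s")
    print(f"distance light travels in that time ~ {C * dt:.3e} m")
```

The distance $c\,\Delta t = \hbar/(m_e c)$ is the reduced Compton wavelength of the electron, which sets the length scale over which such fluctuations matter.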

This heuristic discussion indicates that the fluctuations are universal and involve all 
particle species, including the photon and the graviton (which happen to be their own 
antiparticles). For massless particles such as the photon and the graviton, the denominator 
in the estimate for $\Delta t$ should be replaced by their characteristic energies.³ 


* I remark in passing that you have already seen repeatedly, in chapters II.3, IV.1-IV.3, VI.4, and VI.5 
for example, another need for quantum field theory when we write down actions describing the interaction 
between particles and the electromagnetic and gravitational fields. There was always an unbearable and unsightly 
dichotomy between point particles on one hand, and spacetime-pervading fields on the other hand. It would be 
intellectually more satisfying to treat all the elementary particles, the electron and all the rest, on the same footing 
as fields. 

† Since the minimum energy the electron and the positron can have is of order $m_e c^2$. 

This boiling vacuum with all its agitation is irrelevant for a large chunk of physics. The 
reason is that the time scale At over which a fluctuation occurs is much shorter than 
the typical time scales explored in many areas of physics. In a collision between particles, 
however, the fluctuating pair can borrow the required energy from the colliding particles 
and thus evade the Heisenberg bound on At. The electron and positron pair does not 
have to annihilate each other but can escape to infinity. Thus, we can scatter an electron at 
high energy off a proton and produce an electron-positron pair, a process known as pair 
production in quantum field theory. 

In contrast, write down the Schrödinger equation for an electron scattering off a proton. 
The equation describes the wave function of one electron, and no matter how you shake and 
bake the mathematics of the partial differential equation, you will always have one and only 
one electron. The Schrödinger equation is simply incapable of describing pair production. 
Nonrelativistic quantum mechanics must break down under these circumstances. 

Not quantum field theory in five minutes, of course, but the essence of quantum field 
theory in five minutes! 


Vacuum fluctuations near a black hole 


Pairs of particles and antiparticles pop out of the vacuum for an instant and then vanish. 
These incessant but ephemeral fluctuations were largely of interest only to particle 
physicists until Hawking came along.⁴ 

But what if the fluctuations occur near the horizon of a black hole? 

As we have learned, Riemann curvature near the horizon scales like $\sim 1/r_S^2 \propto 1/(GM)^2$, 
and spacetime can be almost flat for $M$ large. Unlike particle physics, Einstein gravity is 
not normally concerned with high energies. But it’s not the curvature, rather the causal 
structure, that matters! 

At the horizon $r = 2GM$, the coefficients of $dt^2$ and $dr^2$ change sign, indicating that 
time and space, and hence energy and momentum, are interchanged. A pair pops out 
near the horizon. During the short time the pair can exist, one of them, say the antiparticle 
to be definite, could fall through the horizon, at which point its energy becomes a 
momentum component! The particle, liberated from the constraints of energy conservation 
and Heisenberg’s principle, can now exist forever and escape to infinity, where it tries 
to live happily ever after without its partner. 

In particle physics, colliding particles supply the energy needed to balance the books. 
In Einstein gravity, while Nature compulsively balances her energy budget, we fool her by 
dumping one of the particles of the pair down a black hole. The Heisenberg restriction on 
At is evaded by changing what we mean by energy as the particle crosses the horizon. 

In a Kruskal diagram, you can easily depict this process, showing the antiparticle falling 
to its doom at $r = 0$ and the particle escaping to $\mathcal{I}^+$. To balance the energy-momentum 
budget, the black hole would have to lose a bit of mass and recoil a little. For a black hole 
with mass $M$ much greater than the typical energy of the escaping particle, these effects are 
negligible. These fluctuations occur ceaselessly around the horizon, and thus we conclude 
that the black hole radiates universally: all particles are involved. 

A detailed quantum field theoretic calculation* should reveal to us the energy distribution 
of the radiated particles, which is precisely what Hawking did. 

But even without doing the calculation, we can anticipate, if we are willing to play fast 
and loose,† what the distribution has to be: the only universal energy distribution known 
to physics is the Boltzmann distribution, and thus we expect that the probability for the 
radiated particle to have energy $E$ is proportional to $e^{-E/T_H}$ for some temperature $T_H$ 
characteristic of the black hole. 

The kind of discussion given here is clearly meant to be heuristic. One caveat: a photon in 
the Hawking radiation has characteristic energy $\omega \sim T_H \sim (GM)^{-1}$ and thus a wavelength 
$\lambda \sim GM$ comparable to the size of the black hole. The very concept of a particle may be a 
bit dicey. 
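
The caveat can be made slightly more quantitative. Taking the precise expression $T_H = \hbar c^3/8\pi GM$ quoted later in this chapter, and identifying $\hbar\omega$ with $T_H$ (Boltzmann's constant set to 1, an assumption that fixes only the order of magnitude), the wavelength $\lambda = 2\pi c/\omega$ comes out as a fixed multiple of $r_S$, whatever the mass. The Python sketch below checks this; constants and function names are illustrative assumptions.

```python
import math

G = 6.67430e-11        # Newton's constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s
M_SUN = 1.98847e30     # approximate solar mass, kg (assumed value)


def schwarzschild_radius(mass_kg):
    """r_S = 2GM/c^2 in meters."""
    return 2.0 * G * mass_kg / C**2


def characteristic_omega(mass_kg):
    """Angular frequency with hbar * omega = T_H = hbar c^3 / (8 pi G M); hbar cancels."""
    return C**3 / (8.0 * math.pi * G * mass_kg)


def characteristic_wavelength(mass_kg):
    """Wavelength lambda = 2 pi c / omega of a photon with that frequency, in meters."""
    return 2.0 * math.pi * C / characteristic_omega(mass_kg)


if __name__ == "__main__":
    for mass in (M_SUN, 1e6 * M_SUN):
        ratio = characteristic_wavelength(mass) / schwarzschild_radius(mass)
        print(f"M = {mass:.3e} kg: lambda / r_S = {ratio:.4f}")
```

With this identification the ratio is $8\pi^2$, independent of $M$. The exact number depends on conventions (for instance, where one places the peak of a thermal spectrum), but the wavelength is never small compared with the black hole.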

Although many people believe that Hawking radiation provides a crucial clue to the 
eventual understanding of quantum gravity, it is worth emphasizing that the calculation 
leading to Hawking radiation does not involve quantum gravity as such. The role of 
the gravitational field is to provide a spacetime with a peculiar causal structure for the 
other fields to do their quantum fluctuating in. The Schwarzschild solution is still treated 
classically. 

An important clue is provided by the black hole information paradox, first articulated 
by Hawking. Put at the most elementary level, the question is: what happened to the 
information contained in the material that fell in to form a black hole? Eventually, we end 
up with thermal radiation, which, according to standard considerations, does not contain 
any information at all. The paradox may be sharpened as follows. Consider an initial 
distribution of matter described by a pure state in quantum mechanics, which collapses to 
form a black hole. After we wait long enough, this evolves into a thermal state described by 
a density matrix in quantum mechanics. But quantum mechanical evolution is governed 
by a unitary operator, which cannot possibly turn a pure state into a thermal state. Thus, 
there appears to be a basic contradiction with quantum mechanics, hence a paradox. This 


* At least with the benefit of hindsight, the calculation is not as difficult as you might think. Consider a black 
hole radiating electrons and positrons, to be specific. We are not interested in the interaction of the electron 
and positron with the electromagnetic field and with each other. In other words, we don’t need a full blown 
mastery of quantum electrodynamics, but instead, we can treat the electron field as a free field propagating in 
the Schwarzschild metric, that is, free except for the influence of gravity. Nevertheless, I choose not to do the 
calculation here, as it involves a number of concepts from quantum field theory. Instead, I give in appendix 1 a 
slick derivation of $T_H$ which may well turn out to be more profound than the actual nitty-gritty calculation. For 
those who want to go through an actual calculation, a good place to start is the paper by W. G. Unruh. 

† Objections to this kind of handwaving argument come readily to mind. Why are things in thermal equilibrium? 
The Boltzmann distribution presupposes some kind of heat bath. Where is it? 

‡ In the introduction in part 0, in discussing the cube of physics, I associated the corner with $G \neq 0$, $\hbar \neq 0$, 
and $c \neq 0$ with quantum gravity. I mentioned, in a footnote, a slight caveat to this statement. Here it is. While 
$G$, $\hbar$, and $c$ all appear in (1) and (2), the calculation leading to them was done without quantizing gravity. Even 
if we were to include the radiation of gravitons, the gravitons could be treated as small fluctuations superposed 
on the classical gravitational field. 
subject has a long and controversial history that cannot possibly be covered here; you are 
invited to trace back this history starting with the recent literature.⁷ One possibility is that 
the naive view that, in a region in which the Riemann curvature tensor could be made 
arbitrarily small, physics would be indistinguishable from flat spacetime, may be wrong. 
The limit may turn out not to be smooth. 


A semi-quantitative argument for Hawking radiation 


Our friend the Smart Experimentalist suddenly speaks up, “Without knowing quantum 
field theory, we should still be able to make the heuristic argument about the infalling 
particle semi-quantitative. Think of an experimentalist at rest close to the horizon at 
$r = r_S + a$. She observes in her lab a particle-antiparticle pair popping out. The entire 
lab falls freely and crosses the horizon. The horizon is not marked by a line or anything; 
inside the lab is merely almost-flat, almost-empty spacetime.” 

Confusio catches on enthusiastically. “The smaller $a$, the sooner we cross the horizon, 
the shorter is $\Delta t$, and so, according to Heisenberg, the escaping particle could have a 
higher energy. Oops, it seems that the characteristic energy of the particle detected at 
infinity increases as $a$ decreases.” 

SE smiles, “Confusio, you forgot the gravitational redshift! Recall that energy is redshifted 
down by a factor given by the square root of $g_{00}$ evaluated at $r = r_S + a$, which 
almost by definition vanishes as we approach the horizon, as $a \to 0$.” 

Confusio is delighted. “Let’s hope that the two effects cancel out.”* 

I say to both of them, “We will let the attentive reader find out if the a dependence does 
indeed cancel out.” Challenge yourself. See exercise 1. 


“The consequences of my crime echo down to the end of time” 


One afternoon in 1970, . . . I told [Bekenstein] of the concern I 
always feel when a hot cup of tea exchanges heat energy with a 
cold cup of tea. By allowing that transfer of heat . . . I increase 
[the universe’s] microscopic disorder, its information loss, its 
entropy. “The consequences of my crime, Jacob, echo down 
to the end of time,” I noted. “But if a black hole swims by, 
and I drop the teacups into it, I conceal from all the world the 
evidence of my crime. How remarkable!” Bekenstein, a man of 
deep integrity, takes the lawfulness of creation as a matter of 
the utmost seriousness. Several months later he came back with 
a remarkable idea. “You don’t destroy entropy when you drop 
those teacups into the black hole. The black hole already has 
entropy and you only increase it!” 


—John Archibald Wheeler⁸ 


* This indicates that an observed photon in the Hawking radiation may have originated near the horizon with 
trans-Planckian energy—a fact that you may or may not find disturbing. 
As this story told by Wheeler indicates, his student Jacob Bekenstein was the first 
to recognize that black holes have entropy. In fact, as (2) shows, in classical general 
relativity, not only does the Schwarzschild black hole have entropy, it also actually has 
an infinite amount of entropy. This makes sense, since entropy is the logarithm of the 
number⁹ of microstates¹⁰ that correspond to a single equilibrium macrostate, and we can 
make a Schwarzschild black hole of a given mass M in an infinite number of ways, by 
throwing any amount and any variation of stuff into it, provided that the total mass adds 
up to M. 

For something as fundamental as the entropy of black holes, we politely decline to use 
ludicrous units, such as joules per degree centigrade, and so once again in this chapter, we 
go back to the introduction to this text and recall the other profound concept mentioned 
there, namely Planck’s insight into measurement. Recall that we have three fundamental 
units to do physics with: the Planck mass $M_P = \sqrt{\hbar c/G}$, the Planck length $l_P = \sqrt{\hbar G/c^3}$, and the 
Planck time $t_P = \sqrt{\hbar G/c^5}$. By now we understand well how length and time can be measured 
with the same unit, so set $c = 1$ and write 

$$G = \frac{l_P^2}{\hbar} = \frac{t_P^2}{\hbar} = \frac{\hbar}{M_P^2} \qquad (3)$$

Note also that $M_P l_P = \hbar$. 

In the introduction, we already derived in these natural units the entropy of a Schwarzschild 
black hole: $S \sim GM^2/\hbar \sim R^2/\hbar G \sim A/l_P^2$, with the surface area of the black hole 
$A \sim R^2 \sim r_S^2 \sim (GM)^2$. I also told you there that you should be shocked, shocked, shocked. 
The entropy of a physical system is normally extensive* and proportional to its volume. It 
is as if the entropy of a black hole were to reside completely on its surface. Indeed, imagine 
laying down a grid on the surface of a black hole. Somehow, each Planck-sized cell contains 
one unit of entropy. This mysterious property of black holes, which represents one of the 
deepest puzzles in theoretical physics, led ’t Hooft and Susskind separately to formulate 
the so-called holographic principle (see chapter IX.11). 

In appendix 1, we derive the precise expression $T_H = \frac{\hbar c^3}{8\pi GM}$ for the Hawking temperature. 
Given this, we can use elementary physics to write down a precise expression for 
the entropy: $d(Mc^2) = T_H\,dS = \hbar c^3\,dS/8\pi GM$, which implies that $S = 4\pi GM^2/\hbar c$. Using 
$A = 4\pi r_S^2$ and $r_S = 2GM/c^2$, we obtain 

$$S = \frac{c^3 A}{4\hbar G} = \frac{A}{4 l_P^2} \qquad (4)$$

(You could of course absorb the factor of 4 into the definition of $l_P$ if you want.) 
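
To get a feel for the numbers in (4), here is a minimal numerical sketch. It assumes rounded SI values for the constants, restores Boltzmann's constant (which the text sets to 1), and uses an approximate solar mass; the function names are illustrative only. It also checks that the mass form $S = 4\pi GM^2/\hbar c$ and the area form $S = A/4l_P^2$ agree.

```python
import math

# Rounded SI constants, assumed here purely for illustration.
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
K_B = 1.380649e-23       # J / K (set to 1 in the text)
M_SUN = 1.98847e30       # kg, approximate solar mass


def hawking_temperature(mass_kg):
    """T_H = hbar c^3 / (8 pi G M k_B), in kelvin."""
    return HBAR * C**3 / (8 * math.pi * G * mass_kg * K_B)


def entropy_from_mass(mass_kg):
    """S / k_B = 4 pi G M^2 / (hbar c), a pure number."""
    return 4 * math.pi * G * mass_kg**2 / (HBAR * C)


def entropy_from_area(mass_kg):
    """S / k_B = A / (4 l_P^2), with A = 4 pi r_S^2 and r_S = 2 G M / c^2."""
    r_s = 2 * G * mass_kg / C**2
    area = 4 * math.pi * r_s**2
    planck_area = HBAR * G / C**3
    return area / (4 * planck_area)


t_sun = hawking_temperature(M_SUN)
s_mass = entropy_from_mass(M_SUN)
s_area = entropy_from_area(M_SUN)
print(f"T_H ~ {t_sun:.2e} K, S/k_B ~ {s_mass:.2e}")
```

For a solar-mass black hole this gives a temperature of order $10^{-7}$ K and an entropy of order $10^{77}$, which is the sense in which the entropy is enormous and the radiation negligible for astrophysical black holes.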
When we take the classical limit by letting ħ → 0, we hold G, not $M_P$, fixed. Indeed, $M_P$ 
is not a concept in classical physics. In the classical treatment of black holes, the entropy 
$S \propto \hbar^{-1}$ is formally infinite (as I have mentioned twice already), since the black hole can 
be made in an unlimited number of ways. For dM = TdS to be satisfied, it is consistent 
to set T = 0, which means the black hole does not radiate. 

* This is proved for systems with short-ranged interactions. 

This suggests another handwaving (be warned!) argument for Hawking radiation, due 
to Gibbons and Hawking. For the entropy S of a black hole to be finite, quantum physics 
must somehow limit the number of ways a black hole can be made. Let us focus on the 
difference in entropy between two black holes of mass M and M + dM. Consider the 
relation dM = TdS: if dS is infinite, T would be 0, and we would have no radiation. As you 
know, an elementary fact of quantum physics is that the size of a particle is characterized 
by its de Broglie wavelength. A particle whose wavelength is much smaller than the 
Schwarzschild radius rs can be regarded as a point particle and would fall in (depending on 
its velocity and impact parameter, and so forth), but a particle whose wavelength is larger 
than rs could simply pass the black hole by. Thus, a particle whose wavelength is larger 
than GM but smaller than G(M + dM) is less likely to fall into the smaller black hole. 
We thus argue that dS is actually finite when quantum mechanics is turned on. Once you 
admit that dS is not infinite, then the relation dM = TdS no longer forces T to vanish, 
and once you admit that T ≠ 0, we can then run our dimensional analysis argument. 

The entropy of a black hole is finite, and so Wheeler was not able to violate the second 
law of thermodynamics by throwing cups of tea into a passing black hole. As Bekenstein 
explained to him and to the rest of us, he had merely increased the entropy of the black 
hole. If Wheeler were right, we could all help to decrease the disorder in the universe by 
simply dumping our mess into passing black holes. 


’t Hooft’s bound 


The mass and surface area of a Schwarzschild black hole are closely related. A rotating 
black hole, however, has another dimensionful parameter, the angular momentum J, so 
that its surface area A = A(M, J) is a function of its mass M and angular momentum J, 
as we will see in chapter VII.5. Classically, the mass of a Schwarzschild black hole always 
increases, and so by free association, one might be tempted to think, as people did around 
the time of Bekenstein’s insight, that it is related to another quantity in physics that always 
increases, namely the entropy. For a rotating black hole, however, as Penrose discovered 
in 1969 and as we will explain in chapter VII.5, we can decrease M by a physical process. 
But remarkably, the decrease in M is always accompanied by a decrease in J in precisely 
such a way so that A always increases. This indicates that we should associate the entropy 
of a black hole with its surface area. 

Let’s go back to the Schwarzschild black hole. Imagine letting two black holes with 
masses $M_1$ and $M_2$ slowly coalesce into a single black hole with mass $M_1 + M_2$, neglecting¹¹ 
the energy carried away by gravitational waves. Indeed, the surface area always increases: 
$(M_1 + M_2)^2 > M_1^2 + M_2^2$. 
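
The inequality can be spelled out in one line. The labels $A_1$, $A_2$, and $A_{\rm final}$ are introduced here only for illustration; they use the fact that the Schwarzschild area scales as $M^2$ with the same proportionality constant for every hole.

```latex
(M_1 + M_2)^2 - \left(M_1^2 + M_2^2\right) = 2 M_1 M_2 > 0
\quad\Longrightarrow\quad
A_{\rm final} \propto (M_1 + M_2)^2 \;>\; M_1^2 + M_2^2 \propto A_1 + A_2 .
```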

Since the Planck area $l_P^2$ is so ludicrously small (numerically, ~ 2.6 × 10⁻⁶⁶ cm²), any 
macroscopic black hole has an enormous entropy, which as you might expect, greatly 
exceeds the entropy of other physical systems. 

For comparison, consider a box of volume V filled with relativistic matter, for example 
photons, characterized by a temperature T. By relativistic matter, we mean matter consisting 
of particles whose masses are negligible compared to their energies. The entropy 
and energy of a photon gas are worked out in textbooks on statistical mechanics, but for 
our purposes, we can simply use dimensional analysis. In natural units, temperature T 
has dimensions of energy or inverse length. In contrast, entropy S is dimensionless and 
proportional to the volume of the box $V \sim L^3$, with L the characteristic size of the system. 
So, the entropy can only be $S \sim L^3T^3$. Similarly, the energy density ε has dimensions of 
mass over length cubed, or mass to the fourth power, and thus by dimensional analysis 
$\varepsilon \sim T^4$, leading to a total energy $E \sim L^3T^4$. As you will see presently, the overall numerical 
factors here do not matter at all. 

An almost universally accepted (but not yet mathematically proven) folk belief is that if 
a physical system has a Schwarzschild radius $r_S \sim M$ larger than its size L, it will collapse 
into a black hole. (The obesity index in the introduction!) Now consider a box of photons 
so hot that, if we throw in just a bit more energy, the box will collapse and become a black 
hole. The condition of being on the verge translates into $E \sim L^3T^4 \lesssim L$, that is, $T \lesssim 1/L^{1/2}$. 
The entropy of the box is thus 

$$S \sim L^3T^3 \lesssim L^{3/2} \sim A^{3/4} \qquad (5)$$


A box of electromagnetic radiation hot enough to be almost a black hole has an entropy 
that can grow at most like the 3/4 power of area, rather than like the area, as is the case for 
a black hole. Remember that we are using the Planck area to measure area with. Thus, for 
$A \gg l_P^2$, this entropy is tiny compared to that of a black hole. This bound was obtained by 
’t Hooft in 1993. 
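
To see how lopsided the comparison is, here is a rough numerical sketch for a system 1 cm across. It assumes an approximate Planck length of 1.616 × 10⁻³³ cm and drops all order-one factors except the 1/4 in (4); the function names are made up for illustration.

```python
PLANCK_LENGTH_CM = 1.616e-33  # cm, approximate value assumed for illustration


def area_in_planck_units(size_cm):
    """Area A ~ L^2 in units of the Planck area (order-one factors dropped)."""
    return (size_cm / PLANCK_LENGTH_CM) ** 2


def black_hole_entropy(area):
    """Bekenstein-Hawking scaling S = A / 4, with A in Planck units."""
    return area / 4.0


def hot_box_entropy_bound(area):
    """Scaling of (5) for a box on the verge of collapse: S <~ A^(3/4)."""
    return area ** 0.75


area = area_in_planck_units(1.0)   # a system of size 1 cm
s_bh = black_hole_entropy(area)
s_box = hot_box_entropy_bound(area)
print(f"A ~ {area:.1e}, S_BH ~ {s_bh:.1e}, S_box <~ {s_box:.1e}")
```

The black hole wins by a factor of order $A^{1/4}$, which for a centimeter-sized system is already some fifteen orders of magnitude.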


When do we need quantum gravity? 


It is important to emphasize that in deriving Hawking radiation, we don’t have to quantize 
the gravitational field. What we have to quantize is the particle being emitted: it and its 
antipartner are the ones that are quantum fluctuating out of the vacuum. Gravity’s task is 
to change the causal structure of spacetime, and Einstein’s classical theory is entirely up 
to the job. No quantum gravity is needed.¹² 

This may be an appropriate occasion to give a handwaving argument¹³ regarding when 
we have to worry about the quantum nature of a field. Consider an object of mass M, for 
example, you. As you walk around, you are surrounded by a gravitational field that in reality 
consists of a swarm of gravitons. Let’s estimate N, the number of quanta in the swarm. If 
the number of quanta in the field is of order 1, then we would certainly have to deal with 
the quantum nature of the field. But if N ≫ 1, then the field can be treated classically. To 
estimate N, let the object be spherical,¹⁴ and imagine the swarm of gravitons spread out 
in a spherical distribution with a characteristic size L. By the uncertainty principle, the 
characteristic energy of a graviton is then of order ε ~ ħ/L. The total energy contained 
in the gravitational potential φ = −GM/r is, according to the Newton-Einstein-Hilbert 
action, given by 

$$E \sim G^{-1}\int d^3x\,(\nabla\phi)^2 \sim G^{-1}\int dr\,r^2\,(GM/r^2)^2 \sim GM^2/L$$

Thus, the number of quanta equals $N \sim (GM^2/L)/(\hbar/L) \sim GM^2/\hbar \sim (M/M_P)^2$, where I 
used (3) in the last step. 

This is a pleasing result and presumably accords with your intuition: unless the mass 
M is comparable to the Planck mass Mp, you don’t need to lose any sleep over quantum 
gravity at all. You certainly did not expect that the field surrounding you could not be 
treated classically, did you? This heuristic argument applies to all masses, including black 
holes. Thus, you only have to worry when the mass of the black hole drops to ~Mp as it 
approaches its explosive end. 
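
As a quick sanity check of the estimate $N \sim (M/M_P)^2$, the following sketch evaluates it for a 70 kg person. The constants are rounded SI values, the 70 kg figure is an arbitrary illustrative choice, and order-one factors are ignored, as in the argument itself.

```python
import math

# Rounded SI constants, assumed here purely for illustration.
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)


def planck_mass():
    """M_P = sqrt(hbar c / G), in kilograms."""
    return math.sqrt(HBAR * C / G)


def number_of_quanta(mass_kg):
    """Heuristic graviton count N ~ (M / M_P)^2, order-one factors dropped."""
    return (mass_kg / planck_mass()) ** 2


m_p = planck_mass()
n_person = number_of_quanta(70.0)   # illustrative 70 kg object
print(f"M_P ~ {m_p:.3e} kg, N ~ {n_person:.1e}")
```

With N of order $10^{19}$, the gravitational field of an everyday object is about as classical as a field can be; N drops to order 1 only when M approaches $M_P$.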

The origin of the Bekenstein-Hawking entropy (4) poses a deep mystery. As already 
mentioned, and as you know from a course on statistical physics, entropy measures the 
number of microstates that correspond to a given macrostate. But no amount of staring at 
the Schwarzschild metric, which is just a solution of some coupled differential equations, 
is going to let you count the microstates. To address this mystery, a theory of quantum 
gravity is no doubt needed. Indeed, one triumph of string theory as a candidate theory for 
quantum gravity is to provide this counting. This was accomplished by Strominger and 
Vafa in 1996 for a class of 5-dimensional extremal* black holes in string theory. The rea- 
soning! is highly technical and involves, for example, concepts such as supersymmetry. 
Roughly speaking, the strategy involves adiabatically lowering the gravitational constant in 
a thought experiment to the point when the black hole dissociates into a bunch of objects 
specific to string theory known as D-branes, whose degrees of freedom one can count using 
highly nontrivial techniques. Remarkably, the counting yields precisely the area-entropy 
relation (4). Since then, much progress has been made, and now people understand what 
is going on in (3 + 1)-dimensional spacetime, including some cases without supersymmetry.¹⁷ 
At present, a straightforward accounting of the entropy of a plain Schwarzschild 
black hole has not yet been accomplished. 

I should warn you that the three appendices to this chapter are exceptionally demanding. 
Some minimal knowledge of quantum and statistical mechanics is required to read these 
appendices. Those readers who have never heard of quantum mechanics may wish to skip 
the first two appendices, or at least read them with the appropriate attitude to get merely 
a flavor of these more advanced topics. 


Appendix 1: Determining the Hawking temperature 


As I’ve long promised, ever since the introduction, we will now calculate the Hawking temperature $T_H$, including 
all the factors of 2 and π. These factors are not essential for our understanding of Einstein gravity, but we 
physicists, in contrast to talkers, do have to subscribe to the Feynman “shut up and calculate” dictum, at least 
occasionally. 

* The term “extremal” will be explained in chapter VII.6. 

First, I have to tell you about a mysterious correspondence between quantum statistical mechanics and 
quantum field theory. You have probably learned that in Heisenberg’s formulation of quantum mechanics, the 
evolution of a quantum state after time T is governed by the evolution operator $e^{-iHT}$, with H the Hamiltonian. 
The probability amplitude for an initial state $|I\rangle$ to end up in the final state $|F\rangle$ is then given by 

$$Z = \langle F|e^{-iHT}|I\rangle \qquad (6)$$

Much of the work in quantum mechanics and in quantum field theory involves massaging (or beating) this 
quantity into a form we can work with. For example, in the Dirac-Feynman path integral formulation,¹⁸ one 
follows Newton and Leibniz by breaking up $\langle F|e^{-iHT}|I\rangle$ into infinitesimal factors and then expressing the 
resulting product as an integral over all possible paths the classical system could have followed going from the 
initial to the final state. 

However, Boltzmann taught us that, at temperature T, the relative probability of a state $|n\rangle$ of energy $E_n$ 
occurring is given by $e^{-E_n/T} = e^{-\beta E_n}$, where β = 1/T, as usual. (You should not confuse the temperature T with 
the time T in the preceding paragraph of course: the same letter for two different concepts in two different areas 
of physics! The introduction of the inverse temperature β helps in this context; we won’t return to temperature 
until later.) We define the partition function of a quantum mechanical system with the Hamiltonian H by 

$$Z = \sum_n \langle n|e^{-\beta H}|n\rangle = \sum_n e^{-\beta E_n} = \mathrm{Tr}\,e^{-\beta H} \qquad (7)$$

The sum over states is represented by a trace, with $e^{-\beta H}$ regarded as a matrix. As is probably well known to you, 
various physical quantities, such as the expected value of energy $E = \sum_n E_n e^{-\beta E_n}/Z$, can be extracted from the 
partition function Z. 
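
As a concrete instance of (7), here is a small sketch that evaluates Z and the mean energy for a harmonic oscillator. The spectrum $E_n = n + 1/2$ (in units where the level spacing is 1), the truncation at 200 levels, and the choice β = 1 are all assumptions made for illustration; the closed forms used for comparison are the standard geometric-series results.

```python
import math


def partition_function(energies, beta):
    """Z = sum_n exp(-beta E_n), the trace in (7) in the energy eigenbasis."""
    return sum(math.exp(-beta * e) for e in energies)


def mean_energy(energies, beta):
    """E = sum_n E_n exp(-beta E_n) / Z."""
    z = partition_function(energies, beta)
    return sum(e * math.exp(-beta * e) for e in energies) / z


# Harmonic oscillator levels with unit spacing, truncated at 200 levels.
levels = [n + 0.5 for n in range(200)]
beta = 1.0

z = partition_function(levels, beta)
e_mean = mean_energy(levels, beta)

# Closed forms for the untruncated oscillator (geometric series).
z_exact = math.exp(-beta / 2) / (1 - math.exp(-beta))
e_exact = 0.5 + 1 / (math.exp(beta) - 1)
print(z, z_exact, e_mean, e_exact)
```

The truncated sums agree with the closed forms to machine precision because the Boltzmann weights of the discarded levels are utterly negligible at β = 1.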

Evidently, there is a potentially profound correspondence between the two fundamental equations (6) and 
(7). To go from (6) to (7), we simply replace the time T by −iβ, set $|I\rangle = |F\rangle = |n\rangle$, and sum over $|n\rangle$. What a 
mysterious procedure! First, we make time imaginary. Then we force every state $|n\rangle$ to go back to itself. But how 
can we make sure that every quantum state does this? We can if time is somehow cyclic, so that what is past is 
the future. I have no idea what that means. The inverse temperature β is equal to the recurrence period in this 
strange world with imaginary time. 

Well, we don’t have to understand what any of this means, but we can certainly regard this as a devilishly nifty 
computational trick. Consider a quantum field, be it the field of a photon, an electron, or whatever, propagating 
in spacetime. Suppose it discovers that time is actually imaginary and cyclic. The field is fooled into thinking that 
it is living in a temperature bath, to use a term from statistical mechanics, with the temperature determined by 
the inverse of the recurrence period β of this bizarre imaginary time. 

Amazingly, we can now use this strange observation to determine the temperature of the Hawking radiation 
from a Schwarzschild black hole. Consider the electromagnetic field, for instance, governed by the action 
$S = \int d^4x\,\sqrt{-g}\,\left(-\tfrac{1}{4}g^{\mu\rho}g^{\nu\sigma}F_{\mu\nu}F_{\rho\sigma}\right)$, propagating in the Schwarzschild spacetime described by 

$$ds^2 = -\left(1 - \frac{r_S}{r}\right)dt^2 + \left(1 - \frac{r_S}{r}\right)^{-1}dr^2 + r^2d\theta^2 + r^2\sin^2\theta\,d\varphi^2 \qquad (8)$$

with $r_S = 2GM$. Near the horizon, $ds^2 \simeq -\frac{r - r_S}{r_S}dt^2 + \frac{r_S}{r - r_S}dr^2 + r_S^2d\Omega^2$. Change variables from r to ρ given by 
$\rho^2 = 4r_S(r - r_S)$. Then $\rho\,d\rho = 2r_S\,dr$, so that $\rho^2d\rho^2 = 4r_S^2dr^2$ or $(r - r_S)d\rho^2 = r_S\,dr^2$. Plugging this into $ds^2$, we 
find that spacetime near the horizon is described by 

$$ds^2 = -\frac{\rho^2}{4r_S^2}dt^2 + d\rho^2 + r_S^2d\Omega^2 \;\rightarrow\; \frac{\rho^2}{4r_S^2}dt_E^2 + d\rho^2 + r_S^2d\Omega^2$$

where in the last step, we set* $t = -it_E$ as per the mysterious procedure outlined above. If we now change variable, 
setting $t_E = 2r_S\psi$, we obtain 

$$ds^2 = d\rho^2 + \rho^2d\psi^2 + r_S^2d\Omega^2 \qquad (9)$$


* The subscript E stems from the terminology used in quantum field theory; upon time becoming imaginary, 
(3 + 1)-dimensional Minkowskian spacetime morphs into 4-dimensional Euclidean space. 

We recognize that the first two terms describe a plane with polar radius ρ and polar angle ψ. The (3 + 1)-dimensional 
spacetime has been analytically continued into a 4-dimensional Euclidean space consisting of a 
plane, at every point of which is attached a sphere of radius $r_S$. More importantly, since ψ is an angular variable, 
we see that the “imaginary time” $t_E = 2r_S\psi$ has a recurrence period of $2r_S(2\pi) = 4\pi r_S$. Thus, according to the 
preceding discussion, the electromagnetic field propagating near the horizon of the Schwarzschild black hole 
thinks that it is living in a heat bath with temperature 

$$T_H = \frac{\hbar}{4\pi r_S} = \frac{\hbar}{8\pi GM} = \frac{\hbar c^3}{8\pi GM} \qquad (10)$$


This is the Hawking temperature* of the black hole! 

In Wigner’s influential essay “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,” he told a story 
about two men sitting next to each other on a plane.¹⁹ One asked the other, “What do you do for a living?” The 
man answered, “I work for the insurance company and I use math to predict how long people will live.” The first 
man said, “You are pulling my leg; I don’t believe that you can do that.” So the second man pulled out a report 
on which was written the Gaussian distribution. The first man pointed to the letter π, saying “But isn’t that the 
ratio of the circumference of a circle to its diameter?” “Exactly.” The first man then exclaimed with a touch of 
displeasure, “Now I know you were fooling around with me. What does the circle have to do with how long a man 
will live?” In an updated version of this story, I imagine you answering, “I am a theoretical physicist and I figure 
out how hot black holes are.” As your flight companion expresses a mixture of admiration and disbelief, you 
show him (10). After he points to the π in the equation, you can tell him that it comes in because time moves in 
a circle! 


Appendix 2: The Unruh effect 


A dutiful reader with a good memory might have recognized that the form $ds^2 \simeq -\frac{\rho^2}{4r_S^2}dt^2 + d\rho^2 + r_S^2d\Omega^2$ of 
the near-horizon Schwarzschild metric described in the preceding appendix looks like the Rindler metric 
$ds^2 = -\rho^2dT^2 + d\rho^2 + \rho^2\cosh^2T\,d\Omega^2$ worked out as an exercise back in chapter III.3. After appropriate rescaling, 
the near-horizon Schwarzschild metric and the Rindler metric are in fact the same for fixed θ, φ (that is, 
for $d\Omega^2 = 0$). Recall that we obtained the Rindler metric by changing the standard coordinates (t, r, θ, φ) for 
Minkowski spacetime to the Rindler coordinates (T, ρ, θ, φ) by t = ρ sinh T, r = ρ cosh T and then plugging 
these transformations into the Minkowski metric $ds^2 = -dt^2 + dr^2 + r^2d\Omega^2$. It is important to note that the 
coordinate transformations just given have the ranges −∞ < T < ∞ and 0 < ρ < ∞. Thus, the new coordinates 
only cover the quadrant defined by r > |t|, as shown in figure III.3.6. 

For fixed θ and φ, the lines of constant ρ trace out hyperbolas in the (t, r) plane as T varies from −∞ to 
∞. It can now be revealed that these hyperbolas, as you may have already seen, are in fact the worldlines of 
accelerating observers in Minkowski spacetime. Suppressing θ and φ and writing $q^0 = \rho\sinh T$, $q^1 = \rho\cosh T$ 
for the spacetime location of an observer labeled by the parameter ρ, we have for the proper time of the observer 
$d\tau = \sqrt{(dq^0)^2 - (dq^1)^2} = \rho\,dT$. Thus, $v^\mu = \frac{dq^\mu}{d\tau} = \rho^{-1}\frac{dq^\mu}{dT} = (\cosh T, \sinh T)$. (As expected, $\eta_{\mu\nu}v^\mu v^\nu = -1$, as per 
the definition of proper time.) The acceleration is then $a^\mu = \frac{dv^\mu}{d\tau} = \rho^{-1}(\sinh T, \cosh T)$, where I display only the 
2 nonzero components of the vectors. Hence $\eta_{\mu\nu}a^\mu a^\nu = \rho^{-2}$, the Lorentz invariant measure of the acceleration 
squared, is a constant independent of T. Note that, for a given ρ, the minimum value of $q^1$ is ρ, attained when 
T = 0. It makes sense that the most highly accelerated observers, namely those with the smallest ρ, manage to 
get the closest to the surface r = |t|, which defines the light cone centered at the origin. 
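
The kinematics above are easy to verify numerically. The sketch below (function names and the sample values ρ = 2, T = 0.7 are arbitrary choices for illustration) checks that the velocity has Minkowski square −1, that the acceleration has Minkowski square $\rho^{-2}$, and that a finite-difference derivative of the worldline reproduces the stated velocity.

```python
import math


def position(rho, T):
    """Worldline point (q^0, q^1) = (rho sinh T, rho cosh T)."""
    return (rho * math.sinh(T), rho * math.cosh(T))


def four_velocity(T):
    """v = dq/dtau, using dtau = rho dT."""
    return (math.cosh(T), math.sinh(T))


def four_acceleration(rho, T):
    """a = dv/dtau."""
    return (math.sinh(T) / rho, math.cosh(T) / rho)


def minkowski_square(vec):
    """eta_{mu nu} x^mu x^nu with signature (-, +)."""
    return -vec[0] ** 2 + vec[1] ** 2


def numerical_velocity(rho, T, h=1e-5):
    """Central-difference estimate of dq/dtau, with dtau = rho dT."""
    q_plus, q_minus = position(rho, T + h), position(rho, T - h)
    return tuple((p - m) / (2 * h * rho) for p, m in zip(q_plus, q_minus))


rho, T = 2.0, 0.7
v = four_velocity(T)
a = four_acceleration(rho, T)
v_num = numerical_velocity(rho, T)
print(minkowski_square(v), minkowski_square(a), 1 / rho**2)
```

Changing T leaves both invariants unchanged, which is the statement that each hyperbola is a worldline of constant proper acceleration 1/ρ.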

With these preliminaries, we are now ready for the point of this appendix. Bill Unruh²⁰ discovered that, as a 
result of quantum fluctuations, an accelerated observer in Minkowski spacetime would perceive a bath of thermal 
radiation. As we will see, this so-called Unruh effect is closely related to the Hawking effect. Just like the Hawking 
effect, a proper derivation of the Unruh effect requires some knowledge of quantum field theory, which as I said, 
I do not presume the typical reader of this book to have. Instead, let me give a handwaving argument.²¹ 

Let our accelerated observer carry a detector designed to detect quantum fluctuations in, say, the electromagnetic 
field. The detector might consist of a quantum mechanical system with energy levels $E_i$, i = 0, 1, 2, . . . . 
Every time the electromagnetic field causes a transition from some level i to level j, the detector will beep. Now 
if the detector is being carried by a uniformly moving observer, and by Lorentz invariance it might as well be 
sitting at rest, we know that nothing will happen. The reason is that a fluctuation that causes a transition from 
i to j would be quickly followed, in a time of order $\hbar/|E_j - E_i|$ (which we assume to be much shorter than 
the reaction time of the detector), by a counterfluctuation that would cause a transition from j back to i. (Some 
readers might know that, in quantum field theory, fluctuations at different points in spacetime are correlated, as 
quantified by the 2-point Green’s function of that field.) But if the detector is being accelerated, then by the time 
the counterfluctuation comes along, it would be moving at a different velocity from before, that is, its rest frame 
would differ from what it was before. The electromagnetic field $\vec E(t, \vec x)$ and $\vec B(t, \vec x)$, when Lorentz transformed 
to the new frame, would not be quite right to cause the transition from j back to i. As a result of this mismatch, 
the detector would indicate the presence of a bath of radiation. (What? You're not convinced? Well, I did tell you 
that the argument was going to involve hand waving.) 

* Surely you would hit it big with mystical types if you tell them that temperature is equivalent to cyclic 
imaginary time. At the arithmetic level, this connection merely comes from the fact that the central objects 
in quantum physics $e^{-iHT}$ and in thermal physics $e^{-\beta H}$ are formally related by analytic continuation. Some 
physicists, including me, feel that there may be something profound here that we have not quite understood. 

To me, a far more convincing heuristic argument is the essential equality between the near-horizon Schwarz- 
schild metric and the Rindler metric, as indicated above. A quantum field only knows about the environment 
it finds itself in through its knowledge of the metric. How does the detector “know” that it is being accelerated 
rather than cruising near the horizon of a black hole? Thus, the Hawking effect and the Unruh effect are both 
likely to be true (or, far less likely, to both be false). 

A crucial feature common to both effects is the presence of a horizon. As you can see from figure III.3.6, 
nothing from the region r < t for t positive could reach, even if it were traveling at the speed of light, the 
accelerated observer. Thus, the surface defined by r = t, which we previously identified as the forward light 
cone centered at the origin, effectively acts as a horizon. Indeed, figure III.3.6 resembles the Kruskal-Szekeres 
diagram for the spacetime around a Schwarzschild black hole. Hence, we can invoke the argument given in this 
chapter in support of Hawking radiation: quantum fluctuation produces a particle and an antiparticle, and before 
they can come together, one of them goes beyond the horizon, leaving the other free to escape. These escaping 
particles would constitute the Unruh radiation. 

The temperature of the Unruh radiation can be estimated to be 

$$T_U \sim a \sim \frac{1}{\rho} \qquad (11)$$

by dimensional analysis, since the only quantity with the dimension of energy, or equivalently an inverse length, 
is the magnitude of the acceleration $a \sim (\eta_{\mu\nu}a^\mu a^\nu)^{1/2}$. 

Next, I will sketch, using broad brushstrokes, the serious derivation first presented by Unruh. Readers are 
forewarned that this will require some knowledge of the quantum world, and those without this knowledge are 
urged to skip the rest of this appendix. 

In quantum mechanics, the position operator q(t) of a harmonic oscillator is expressed in terms of annihilation 
and creation operators $a$ and $a^\dagger$ as follows: $q(t) \sim ae^{-iEt} + a^\dagger e^{+iEt}$ with E > 0 the characteristic energy of the 
oscillator. The ground state of the harmonic oscillator, denoted by $|0\rangle$ (in a notation already used in this chapter), is 
annihilated by the annihilation operator $a$ in the sense that $a|0\rangle = 0$. We generate the excited states $|n\rangle \sim (a^\dagger)^n|0\rangle$ 
by repeatedly acting with the creation operator $a^\dagger$ on the ground state. (Hence it is actually more accurate, in the 
context of quantum mechanics, to speak of $a$ and $a^\dagger$ as lowering and raising operators.) How do we know, of 
$a$ and $a^\dagger$, which one is the annihilation operator and which the creation operator? The answer goes back to the 
fundamental requirement that the creation operator is to create a state with positive energy. Thus, $a$ is always 
associated with the positive energy wave function $e^{-iEt}$ and $a^\dagger$ with $e^{+iEt}$. 

In quantum field theory, these notions are generalized in a straightforward fashion. The generic field $\phi(t, \vec x)$, 
namely the analog of q(t), depends on space as well as time, and so the positive energy wave function is 
generalized to $e^{-i(Et - \vec k\cdot\vec x)}$, that is, a wave in space and time, characterized by energy E and momentum $\vec k$. 
Correspondingly, $a(\vec k)$ and $a^\dagger(\vec k)$ now depend on $\vec k$, and an integral over $\vec k$ is required. Thus, we end up writing 

$$\phi(t, \vec x) \sim \int d^3k\,\left(a(\vec k)\,e^{-i(Et - \vec k\cdot\vec x)} + a^\dagger(\vec k)\,e^{+i(Et - \vec k\cdot\vec x)}\right)$$

All this is baby quantum field theory and is explained in any book²² on the subject. Bottom line: the field $\phi(t, \vec x)$ 
is a linear combination of $a(\vec k)$ and $a^\dagger(\vec k)$, associated with $e^{-iEt}$ and $e^{+iEt}$ respectively. In fact, a quantum field 
can be thought of as a collection of harmonic oscillators. An important conceptual difference between quantum 
field theory and quantum mechanics is that the ground state $|0\rangle$ is now more properly called the vacuum state, 
a state in which no particle is present and the field is quiescent. Acting with $a^\dagger(\vec k)$ on $|0\rangle$ produces a state with a 
particle carrying momentum $\vec k$: the operator $a^\dagger(\vec k)$ is said to create a particle out of the vacuum. 

For our purpose here, let us write, more schematically, $\phi \sim \sum_\alpha (a_\alpha f_\alpha + a_\alpha^\dagger f_\alpha^*)$. (Here, * denotes complex 
conjugation.) The important point to take away is merely that the quantum field φ can be written as a linear sum 
of annihilation and creation operators $a_\alpha$ and $a_\alpha^\dagger$ capable of annihilating and creating particles. The subscript α 
labels the properties (such as momentum) the particles may carry. The corresponding wave functions are written 
as $f_\alpha$ and $f_\alpha^*$, and the integral over $\vec k$ is replaced by a sum over α. The vacuum state $|0_A\rangle$ is defined as the state 
annihilated by $a_\alpha$: $a_\alpha|0_A\rangle = 0$. 

Relativity brings a new twist to this framework for studying quantum fields: different observers can disagree 
politely over what they regard as time. For the case at hand, while an observer sitting at rest in Minkowski 
spacetime uses t for his time coordinate, the accelerated observer would insist on using T for her time coordinate. 
More generally, another observer could use as wave functions $g_\beta$ and $g_\beta^*$ instead of $f_\alpha$ and $f_\alpha^*$, and decompose 
the quantum field as $\phi \sim \sum_\beta (b_\beta g_\beta + b_\beta^\dagger g_\beta^*)$. She would use $b_\beta$ and $b_\beta^\dagger$ as her annihilation and creation operators, 
and define her vacuum state $|0_B\rangle$ as the state annihilated by $b_\beta$: $b_\beta|0_B\rangle = 0$. 

This discussion shows that the concept of particles, and even that of the vacuum, depends on the observer. 
In general, the wave functions $g_\beta$ and $g_\beta^*$ can be written as linear combinations of $f_\alpha$ and $f_\alpha^*$. Since φ is the 
same old φ regardless of observer, comparison of the two expressions for φ implies that $a_\alpha$ and $a_\alpha^\dagger$ can be written 
as linear combinations of $b_\beta$ and $b_\beta^\dagger$, and vice versa. This relationship between the two sets of annihilation and 
creation operators is known as a Bogoliubov transformation. 

We are getting close to the punchline! Suppose observer A says, “We are in the state $|0_A\rangle$, and there aren’t 
any particles around.” Observer B would disagree. To her, since $a_\alpha$ is given as a linear combination of $b_\beta$ and $b_\beta^\dagger$, 
schematically, $a_\alpha \sim \sum_\beta (U_{\alpha\beta}b_\beta + V_{\alpha\beta}b_\beta^\dagger)$, the condition $a_\alpha|0_A\rangle = 0$ amounts to, schematically, 
$\sum_\beta U_{\alpha\beta}b_\beta|0_A\rangle \sim \sum_\beta V_{\alpha\beta}b_\beta^\dagger|0_A\rangle$. In other words, $b_\beta|0_A\rangle$, far from being 0, is actually related to a linear combination of $b_\beta^\dagger|0_A\rangle$. 
Observer B would say that the state $|0_A\rangle$ contains particles as defined by her. This may appear as a long-winded 
way of saying that $|0_A\rangle$ is not equal to $|0_B\rangle$, but it goes beyond that by indicating that the number of b-type 
particles contained in $|0_A\rangle$ can be calculated in terms of the coefficients in the Bogoliubov transformation. 

Now we apply this to the situation at hand. An observer sitting at rest in Minkowski spacetime can insist that 
no particle is present, and yet the accelerated observer will see a bath of particles. In other words, the Unruh 
effect! 

I hope that I have given you a flavor of the derivation and prepared you to read Unruh’s paper. For those 
readers who know that the wave functions of a nonrelativistic single particle in a 1-dimensional box of length 
L are given by, for n = 1, 2, . . . , $\psi_n(x) = \sin(n\pi x/L)$ for 0 < x < L, and $\psi_n(x) = 0$ otherwise, I can offer a toy 
example that may or may not help. 

Suppose our nonrelativistic particle is sitting in the ground state $\psi_1(x)$. The probability of finding the particle 
in an excited state $\psi_{n>1}(x)$ is strictly zero. Now suppose the box is suddenly expanded to twice its former size. 
The wave functions are now given by, for n = 1, 2, . . . , $\Psi_n(x) = \sin(n\pi x/(2L))$ for 0 < x < 2L, and $\Psi_n(x) = 0$ 
otherwise. Note that $\Psi_n(x)$ is not the same as $\psi_n(x)$. The initial wave function $\psi_1(x)$ can be expressed, according 
to Fourier, as a linear combination of the new wave functions $\Psi_n(x)$, namely $\psi_1(x) = \sum_n c_n\Psi_n(x)$. The probability 
of finding the particle in an excited state with n > 1 is now nonvanishing, $\propto |c_n|^2$. The sudden expansion of the 
box has excited the particle. Note that in nonrelativistic quantum mechanics, if we start with a single particle, we 
end with a single particle, albeit in an excited state. 
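
The coefficients $c_n$ in this toy example can be computed directly. The sketch below is one way to do it, under assumptions not in the text: the wave functions are normalized (the text leaves the normalization implicit), L is set to 1, and the overlap integral is done with a composite Simpson rule. The exact values quoted in the comments follow from the elementary integral of a product of two sines.

```python
import math

L = 1.0  # original box length, arbitrary units (assumed)


def psi_1(x):
    """Normalized ground state of the original box; zero outside 0 < x < L."""
    if 0.0 < x < L:
        return math.sqrt(2.0 / L) * math.sin(math.pi * x / L)
    return 0.0


def big_psi(n, x):
    """Normalized n-th state of the expanded box, 0 < x < 2L."""
    return math.sqrt(1.0 / L) * math.sin(n * math.pi * x / (2.0 * L))


def overlap(n, steps=2000):
    """c_n = integral of psi_1 * Psi_n over 0 < x < L (composite Simpson rule)."""
    h = L / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        x = i * h
        total += weight * psi_1(x) * big_psi(n, x)
    return total * h / 3.0


# Probabilities |c_n|^2 for the first 40 states of the expanded box.
probs = {n: overlap(n) ** 2 for n in range(1, 41)}
total_prob = sum(probs.values())

# Exact values: |c_2|^2 = 1/2, |c_1|^2 = 32 / (9 pi^2), even n > 2 vanish.
print(probs[1], probs[2], probs[3], total_prob)
```

Half of the probability lands in n = 2 (whose wave function coincides with $\psi_1$ on the original interval), about 36 percent stays in the new ground state, and the remainder is spread over the odd levels, with the total summing to 1.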

When we go to quantum field theory, the role of the particle is played by a quantum field, and the particle 
jumping into an excited state gets translated into the quantum field becoming excited and hence capable of 
creating particles. (Do not confuse the notion of particles in nonrelativistic quantum mechanics with the notion 
of particles in quantum field theory, which correspond to “excitations” in the quantum field. When excited, a 
nonrelativistic particle jumps to a higher energy level; when excited, a quantum field creates particles.) I hope 
that this toy example of an expanding box does not confuse you too much and that it conveys to you the possibility 
that an expanding universe is able to create particles and antiparticles. 


Appendix 3: Thermodynamics of spacetime and Einstein’s field equation 


One intriguing, and possibly fruitful, approach, proposed by Jacobson,²³ is to regard the entropy formula (4) as 
fundamental and to derive Einstein’s field equation from it. To see how this is possible, consider an ideal gas in a 
container of volume V and total energy E. Given S(E, V), thermodynamics teaches us how to find the equation 
of state. In general, dE = TdS − PdV, or $dS = T^{-1}(dE + PdV)$. In other words, $\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_V$ and $\frac{P}{T} = \left(\frac{\partial S}{\partial V}\right)_E$. 
For an ideal gas, the entropy is given by the logarithm of the number of possible states. Since each molecule can 
roam over the volume V, the number of accessible states is proportional to $V^N$, and so we argue that the entropy 
goes like S = N log V + f(E), with some function f(E). The second of the thermodynamic relations just given 
then yields $\frac{P}{T} = \frac{N}{V}$, that is, the well-known equation of state PV = NT. 
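
Written out as a chain, the step from the entropy to the equation of state is just the following; the last relation, which involves the unspecified function f(E), is included only for completeness.

```latex
S = N\log V + f(E)
\quad\Longrightarrow\quad
\frac{P}{T} = \left(\frac{\partial S}{\partial V}\right)_E = \frac{N}{V}
\quad\Longrightarrow\quad
PV = NT,
\qquad
\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_V = f'(E).
```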

At the level of this book, I can only sketch Jacobson’s argument in the crudest possible terms, just to show 
how it might be possible for Einstein’s field equation to come out of the entropy formula. By necessity, I will 
gloss over a great many technicalities and am intentionally vague at times. The following should be read more 
as an enticement to look into the original literature than as an explanation. 

Consider an infinitesimal amount of heat δQ going through a causal horizon: δQ is given by integrating the
energy flux, which depends on $T^{\mu\nu}$, over the area A of the horizon. As a result, the area A changes, but the
change in area δA is determined geometrically by light rays near the horizon converging toward or diverging
away from one another. In chapter IX.3, starting from the geodesic equation, we will derive an equation (known
as the Raychaudhuri equation) governing the amount by which light rays converge or diverge. You would expect
this to be determined by the curvature of spacetime. Indeed, the Ricci tensor $R^{\mu\nu}$ appears in the Raychaudhuri
equation. The entropy formula S = A (suppressing the irrelevant overall constant or choosing sensible units)
tells us how δS (which is related to δQ) is related to δA, and thus how $T^{\mu\nu}$ is related to $R^{\mu\nu}$. The relation, perhaps
not surprisingly, turns out to be Einstein’s field equation.

I have brushed over a host of technicalities, but should have persuaded you that it is at least conceivable 
that Einstein’s field equation could come out of the entropy formula and thermodynamics. Let me say it more 
colloquially. The formula S = A (rather than S ∝ V) is incredibly special and weird; how could the entropy possibly
be proportional to the area!!? Well, the physics of gravity has to be arranged in precisely such a way so that it 
holds. (Jacobson intended his argument to hold for any spacetime, but for pedagogical clarity, I have focused 
here on a black hole.) 

Oh dear, if this view is correct,²⁴ then Einstein’s field equation is demoted to the status of PV = NT, a mere
equation of state. If so, it may have important consequences. To quote Jacobson, “This perspective suggests that
it may be no more appropriate to quantize the Einstein equation than it would be to quantize the wave equation
for sound in air.”²⁵


Exercise 


1 Work out the heuristic calculation outlined in the text by SE and Confusio, thus obtaining another estimate
of $T_H$.


Notes 


1. This fact is sometimes somewhat misleadingly presented as something amazing about black holes. Actually, 
it follows essentially from the virial theorem (see exercise 10 in chapter IV.2) and is generic to gravitating 
systems, including stars. As a star loses its energy through radiation, it generically heats up. 

2. This discussion is taken from p. 3 of QFT Nut, to which you are referred for more details. 

3. To learn how quantum electrodynamics deals with the fluctuating photon, see QFT Nut, or any other 
reputable quantum field theory text. 

4. A definitive and detailed history of the discovery of Hawking radiation has yet to be written, as far as I 
know. At the time, a Russian group consisting of Y. Zel'dovich, A. Starobinsky, and others was actively 
working on radiation from rotating black holes (which we will discuss in chapter VII.5). Unfortunately for
them, they had convinced themselves that Schwarzschild black holes do not radiate, an entirely plausible 
supposition, since the Schwarzschild solution is static. Furthermore, even classically, rotating black holes 
emit particles through the Penrose process (also to be discussed in chapter VII.5). Meanwhile, Don Page, a 
graduate student at the California Institute of Technology, was also working on radiation from black holes 
and discussing his calculations with R. Feynman. Page independently discovered that rotating black 
holes radiate, and Feynman agreed with the conclusion, but then they discovered that Zel’dovich et al. 
had beaten them to it. Several others, including Bill Unruh and Larry Ford, were also working on similar
ideas. I was told that had Hawking not found the radiation from the Schwarzschild black hole, Unruh
probably would have. Incidentally, Hawking’s original motivation was actually to prove that Bekenstein’s 
proposal that black holes had entropy was wrong. I am grateful to Gary Gibbons and Don Page for personal 
accounts of the events surrounding the discovery of black hole radiation. 

5. This suggests another way of understanding Hawking radiation, in terms of quantum tunneling. Work out 
the wave equation of a quantum particle in the Schwarzschild metric. Classically, a particle inside the horizon 
trying to get out is faced with a potential barrier, but a quantum particle could tunnel through the barrier. 

6. W. G. Unruh, Phys. Rev. D 14 (1976), p. 870.

7. See A. Almheiri, D. Marolf, J. Polchinski, and J. Sully, “Black Holes: Complementarity or Firewalls?”
arXiv:1207.3123v2. Note also the papers generated in response to this paper.

8. J. A. Wheeler, A Journey into Gravity and Spacetime, Scientific American Library, W. H. Freeman, 1990, p. 221.

9. Strictly speaking, since counting is involved, entropy is a concept of quantum statistical mechanics and does
not make complete sense in classical physics.

10. See, for example, R. Feynman, Statistical Mechanics.

11. Or surround a black hole of mass M with a spherical shell consisting of a large number of black holes with
masses $M_j$, which we allow to fall into the central black hole, giving us a black hole of mass $M + \sum_j M_j$.
Then $\left(M + \sum_j M_j\right)^2 > M^2 + \sum_j M_j^2$.

12. Even in the Hawking radiation of gravitons from a black hole, we can imagine cutting the metric into two
pieces, a classical piece plus a small quantum piece, small in the sense that we can ignore the interaction
between the gravitons. This type of procedure will be used in discussing gravitational waves in chapter IX.4.

13. I heard this argument from G. Dvali.

14. That is, in the spirit of the famous book Consider a Spherical Cow by J. Harte, consider a spherical you.

15. Indeed, at one point, a contentious subject revolved around what you would expect to see: peculiar remnants
or nothing.

16. A. Strominger and C. Vafa, Phys. Lett. B 379 (1996), pp. 99-104, arXiv:hep-th/9601029.

17. A. Strominger, private communication.

18. This sentence is not intended to make sense in the context of this book. For a detailed explanation, see, for
example, chapter I.2 of QFT Nut.

19. Wigner had them sitting at a bar.

20. W. Unruh, Phys. Rev. D 14 (1976), p. 870.

21. I heard this from Bill Unruh (private communication).

22. For example, QFT Nut, p. 63.

23. T. Jacobson, Phys. Rev. Lett. 75 (1995), p. 1260; arXiv: 9504004v2, 1112.6215v2. See also T. Padmanabhan,
arXiv: 0911.5004. For more recent work, see E. P. Verlinde, JHEP 1104:029 (2011).

24. There are skeptics. One relativist I talked to scoffed that this merely proved that some people were able to
run the relevant equations backward. You judge for yourself.

25. T. Jacobson, Phys. Rev. Lett. 75 (1995), p. 1260.

Interior of stars


In this chapter we study what general relativity has to say¹ about the interiors of stars. We
are going to deal with only the most idealized situation. The magnificent complications of 
stellar interior dynamics are far beyond the scope of this book. 

In the simplest model, the star is assumed to be perfectly spherical (and hence nonrotating),
with its interior consisting of a perfect fluid, a notion we defined way back in chapter
III.6. There we derived the energy momentum tensor of a perfect fluid in flat spacetime,
$T^{\mu\nu} = (\rho + P)U^\mu U^\nu + P\eta^{\mu\nu}$, with $U^\mu$ the local 4-velocity of the fluid. Once again, behold
the power of the equivalence principle! We merely have to promote $\eta^{\mu\nu}$ to $g^{\mu\nu}$ to obtain


$T^{\mu\nu} = (\rho + P)U^\mu U^\nu + P g^{\mu\nu}$ (1)


for curved spacetime.
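Let’s plug this into Einstein’s field equation


$R_{\mu\nu} = \kappa\left(T_{\mu\nu} - \frac{1}{2} g_{\mu\nu} T\right)$ (2)

where we have introduced the shorthand $\kappa = 8\pi G$. With $T = -\rho + 3P$, we have

$R_{\mu\nu} = \kappa\left[(\rho + P)U_\mu U_\nu + \frac{1}{2}(\rho - P) g_{\mu\nu}\right]$ (3)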

Assume a static spherically symmetric interior described by the metric

$ds^2 = -A(r)dt^2 + B(r)dr^2 + r^2 d\Omega^2$ (4)


The resulting Ricci tensor was calculated back in chapter VI.3. Recall that $R_{\mu\nu}$ is diagonal
and that $R_{\varphi\varphi} = \sin^2\theta\, R_{\theta\theta}$.
Solving for the spacetime inside the star


The static spherical symmetry implies physically that the fluid can’t flow, so that $U^i = 0$.
This is also forced upon us by (3) and the vanishing of $R_{0i}$. The normalization condition
$g_{\mu\nu}U^\mu U^\nu = -1 = -A(r)(U^0)^2$ gives $U^0 = A^{-1/2}$ and $U_0 = g_{00}U^0 = -A A^{-1/2} = -A^{1/2}$. Thus,
the field equation (3) implies


$R_{tt} = \frac{A''}{2B} + \frac{A'}{rB} - \frac{A'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) = \frac{1}{2}\kappa(\rho + 3P)A$ (5)

$R_{rr} = -\frac{A''}{2A} + \frac{B'}{rB} + \frac{A'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) = \frac{1}{2}\kappa(\rho - P)B$ (6)

$R_{\theta\theta} = 1 - \frac{1}{B} - \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right) = \frac{1}{2}\kappa(\rho - P)r^2$ (7)


We are to solve these three coupled ordinary differential equations.²


Back in chapter VI.3, we had the easier problem of solving (5-7) with their right hand
sides set to zero. Let us form the same combination $\frac{R_{tt}}{A} + \frac{R_{rr}}{B} + \frac{2R_{\theta\theta}}{r^2}$ that served us well
there. We find the equation $\left(1 - \frac{1}{B} + \frac{rB'}{B^2}\right) = \kappa r^2 \rho$, with a right hand side, though no longer
zero, depending only on ρ. Inspired by the Schwarzschild solution, we define a mass
function M(r) by


$\frac{1}{B} \equiv 1 - \frac{2GM(r)}{r}$ (8)


Inserting this into the equation for B, we find


$\frac{dM(r)}{dr} = 4\pi r^2 \rho(r)$ (9)


which we can integrate immediately to obtain

$M(r) = 4\pi \int_0^r dr'\, r'^2 \rho(r')$ (10)


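According to the Bianchi identity (as explained in chapter VI.5), we can trade one of
the field equations (5-7) for $D_\mu T^{\mu\nu} = 0$. Plugging in the perfect fluid energy momentum
tensor, noting that $D_\mu g^{\mu\nu} = 0$, and using the expression for the covariant divergence of a
tensor $D_\mu T^{\mu\nu} = \frac{1}{\sqrt{-g}}\partial_\mu\left(\sqrt{-g}\, T^{\mu\nu}\right) + \Gamma^\nu_{\mu\lambda} T^{\mu\lambda}$, we have


$0 = D_\mu\left\{(\rho + P)U^\mu U^\nu + P g^{\mu\nu}\right\}$

$= \frac{1}{\sqrt{-g}}\partial_\mu\left\{\sqrt{-g}\,(\rho + P)U^\mu U^\nu\right\} + \Gamma^\nu_{\mu\lambda}(\rho + P)U^\mu U^\lambda + g^{\mu\nu}\partial_\mu P$

$= \frac{\rho + P}{A}\,\Gamma^\nu_{tt} + g^{\nu r}\frac{dP}{dr}$ (11)


The last equality follows since the only nonzero component of $U^\mu$ is $U^0$, and the assumed
spherical symmetry implies that various quantities, such as the pressure P, depend on
r only.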
Looking up the list of Christoffel symbols, we see that the only nonvanishing $\Gamma^\nu_{tt}$ is
$\Gamma^r_{tt} = \frac{A'}{2B}$, so that we obtain the condition of hydrostatic equilibrium on the pressure
gradient $P' \equiv \frac{dP}{dr}$:

$\frac{P'}{\rho + P} = -\frac{A'}{2A}$ (12)


Stellar equilibrium 


Let’s keep count. Of the three field equations (5-7), we have effectively used two, massaging
them into (10) and (12). We choose (7) as our final equation, from which we eliminate
B and A using (8) and (10), and (12), respectively. After some algebra, we arrive at the
Tolman-Oppenheimer-Volkoff³ equation for relativistic stellar structure


$\frac{dP}{dr} = -\frac{GM(r)\rho(r)}{r^2}\left(1 + \frac{P(r)}{\rho(r)}\right)\left(1 + \frac{4\pi r^3 P(r)}{M(r)}\right)\left(1 - \frac{2GM(r)}{r}\right)^{-1}$ (13)


We also have to specify what the star is made of by giving the equation of state P = P(ρ),
relating pressure to density.

We can now work out the stellar structure in this idealized model. Given some equation
of state, eliminate ρ in terms of P, and then integrate the two coupled first order differential
equations (13) and (9) for P(r) and M(r). For some simple P = P(ρ), analytic solutions
may be found, but in general, it is necessary to numerically integrate outward from r = 0
with the boundary condition M(r = 0) = 0 (obviously) and some chosen value of the
central pressure P(r = 0) or equivalently, some central density ρ(r = 0). From (13), we see
that P(r) steadily decreases until it vanishes at some radius R, which defines the radius of
the star. (The pressure vanishes in empty space outside the star, and so, if P(R) ≠ 0, there
would be an infinite pressure gradient at the surface, which is not physically acceptable.)
In other words, the radius R of the star is determined by P(R) = 0. The mass of the star
is then given by M = M(R). Thus, there is a one-parameter family of solutions, with the
mass M and radius R of the star dependent on P(r = 0).
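
The integration procedure just described can be sketched in a few lines. What follows is only an illustration under stated assumptions, not a production stellar-structure code: units with G = c = 1, a fixed-step fourth-order Runge-Kutta scheme, a start at a small radius dr rather than at r = 0 (to avoid dividing by zero), and a user-supplied equation of state passed in as the hypothetical function `rho_of_P`. Equation (13) is used in the algebraically equivalent form dP/dr = −G(ρ + P)(M + 4πr³P)/[r(r − 2GM)].

```python
import math

G = 1.0  # units with G = c = 1 (assumption of this sketch)

def tov_rhs(r, P, M, rho_of_P):
    """Right hand sides of (13) and (9)."""
    rho = rho_of_P(P)
    dP = -G * (rho + P) * (M + 4.0 * math.pi * r**3 * P) / (r * (r - 2.0 * G * M))
    dM = 4.0 * math.pi * r**2 * rho
    return dP, dM

def integrate_tov(P_c, rho_of_P, dr=1e-4, r_max=100.0):
    """Integrate outward from the center; return (R, M) where P(R) = 0.

    Note: near the surface the intermediate Runge-Kutta stages may probe
    slightly negative P, so rho_of_P must tolerate that.
    """
    r = dr                                   # start slightly off center
    P = P_c
    M = 4.0 * math.pi * rho_of_P(P_c) * r**3 / 3.0
    while r < r_max:
        k1P, k1M = tov_rhs(r, P, M, rho_of_P)
        k2P, k2M = tov_rhs(r + dr / 2, P + dr * k1P / 2, M + dr * k1M / 2, rho_of_P)
        k3P, k3M = tov_rhs(r + dr / 2, P + dr * k2P / 2, M + dr * k2M / 2, rho_of_P)
        k4P, k4M = tov_rhs(r + dr, P + dr * k3P, M + dr * k3M, rho_of_P)
        P_new = P + dr * (k1P + 2 * k2P + 2 * k3P + k4P) / 6.0
        M_new = M + dr * (k1M + 2 * k2M + 2 * k3M + k4M) / 6.0
        if P_new <= 0.0:
            # linear interpolation to locate the surface P(R) = 0
            frac = P / (P - P_new)
            return r + frac * dr, M + frac * (M_new - M)
        r, P, M = r + dr, P_new, M_new
    raise RuntimeError("surface not reached before r_max")

# Usage: an incompressible star with density 1 and central pressure 0.1
R, M = integrate_tov(0.1, lambda P: 1.0)
print(R, M)
```

Varying the central pressure `P_c` traces out the one-parameter family of solutions mentioned above. For the incompressible case treated later in this chapter, the numerical radius can be compared with the analytic result.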

At this point, we can also determine the spacetime inside the star. Already, B(r) is given
by (8). We insert (13) into (12) to obtain


$\frac{A'}{A} = \frac{2GM(r)}{r^2}\left(1 + \frac{4\pi r^3 P(r)}{M(r)}\right)\left(1 - \frac{2GM(r)}{r}\right)^{-1}$ (14)


which typically we would have to integrate numerically inside the star.

Outside the star, however, M(r) = M and P(r) = 0, so we can integrate (14) almost
instantly to give $A = 1 - \frac{2GM}{r}$. Nicely, the interior solution joins on with the Schwarzschild
solution with $r_S = 2GM = 2GM(R)$. This verifies the Newton-Jebsen-Birkhoff theorem yet
again.

The Tolman-Oppenheimer-Volkoff equation is written in a particularly attractive form
in (13) to exhibit the Newtonian limit explicitly. To see this, restore c by high school
dimensional reasoning. For example, denoting the dimension of P by [P], we have
$[P] = [\text{force}/\text{area}] = [(ML/T^2)/L^2] = [M/(LT^2)]$. Similarly, $[\rho] = [M/L^3]$. Thus, $[P/\rho] = [(L/T)^2]$,
so that with nonrelativistic units, the expression in the first parentheses in (13)
should be written as $\left(1 + \frac{P}{\rho c^2}\right)$, which tends to 1 as $c \to \infty$. (Actually, you know this already
if you recall from chapter III.6 that for a nonrelativistic gas, the pressure is negligible
compared to the mass density.) You can convince yourself that the other two expressions
in parentheses in (13) also tend to 1 as $c \to \infty$. Therefore, in the nonrelativistic limit the
Tolman-Oppenheimer-Volkoff equation reduces to Newton's equation for stellar structure

$\frac{dP}{dr} = -\frac{GM\rho}{r^2}$ (15)

which just says that the outward force due to pressure on an infinitesimal volume of stellar
material balances the inward force due to gravity. To see this, visualize a thin slab of stellar
material of cross-sectional area dA and bounded between r and r + dr. The net force due
to pressure is given by $P(r)dA - P(r + dr)dA = -\frac{dP}{dr}\,dr\,dA$, and the force due to gravity
by $GM(\rho\, dr\, dA)/r^2$. Note that here we have to invoke Newton's two “superb theorems”
explained way back in our very first chapter, chapter I.1.
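
To get a feeling for how good the Newtonian limit is for an ordinary star, one can put numbers into the three parenthesized factors of (13) with c restored. The sketch below is a rough order-of-magnitude estimate only. The solar central pressure and density are approximate values assumed here (they are not taken from the text), and the second factor is deliberately overestimated by combining the central pressure with the full solar radius.

```python
import math

G = 6.674e-11      # m^3 kg^-1 s^-2
c = 2.998e8        # m/s
M_sun = 1.989e30   # kg
R_sun = 6.957e8    # m
P_c = 2.5e16       # Pa, rough central pressure of the sun (assumed value)
rho_c = 1.5e5      # kg/m^3, rough central density of the sun (assumed value)

# Deviations from 1 of the three factors in (13), with c restored:
first = P_c / (rho_c * c**2)                               # P / (rho c^2)
second = 4.0 * math.pi * R_sun**3 * P_c / (M_sun * c**2)   # crude upper bound
third = 2.0 * G * M_sun / (R_sun * c**2)                   # 2GM / (r c^2) at the surface

print(first, second, third)
```

All three corrections come out far below 1, consistent with the remark that most stars are described well by the Newtonian equation (15).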

Quite remarkably, there is none of this talk about forces in the field equation (3), just 
a statement about how the energy momentum tensor curves spacetime. A lot of physics 
lurks secretly inside (3). To me, that’s part of the magic of theoretical physics. 


Buchdahl’s theorem 


A particularly simple (but somewhat unphysical) equation of state is that of an incompressible
fluid, namely ρ equal to a constant independent of the pressure. Then (10) may
be trivially evaluated, giving $M(r) = (4\pi/3)r^3\rho = (r/R)^3 M$, where the radius of the star R
is determined by P(R) = 0, as explained earlier. The mass of the star M is equal to M(R).
Evidently, we will encounter the combination GM, and so it is convenient to introduce the
symbol $r_S = 2GM$, even though we are talking about a star, rather than a black hole, here.

Things are now sufficiently simple for us to integrate the Tolman-Oppenheimer-Volkoff
equation (13) analytically. We obtain


$P(r) = \rho\,\frac{\left(1 - \frac{r_S}{R}\left(\frac{r}{R}\right)^2\right)^{1/2} - \left(1 - \frac{r_S}{R}\right)^{1/2}}{3\left(1 - \frac{r_S}{R}\right)^{1/2} - \left(1 - \frac{r_S}{R}\left(\frac{r}{R}\right)^2\right)^{1/2}}$ (16)

The scale of P(r) is set by the constant density ρ.
For the central pressure $P(0) = \rho\,\frac{1 - (1 - r_S/R)^{1/2}}{3(1 - r_S/R)^{1/2} - 1}$ to be positive, we must have $3\left(1 - \frac{r_S}{R}\right)^{1/2} > 1$,
with P(0) blowing up when the inequality becomes an equality. Thus, we require

$R > \frac{9}{8} r_S = \frac{9}{4} GM$ (17)


Buchdahl’s theorem states that for any “reasonable” equation of state, the inequality (17)
holds. Recall the criterion $r_S > R$ for the star to become a black hole. Thus, Buchdahl’s star
can never become a black hole.
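
The way the central pressure blows up as the star is made more compact can be seen by evaluating (16) at r = 0. The following is a small sketch with a hypothetical function name; it takes the ratio R/r_S as input and returns P(0)/ρ.

```python
import math

def central_pressure_over_density(R_over_rS):
    """P(0)/rho from (16) at r = 0, for an incompressible star.

    R_over_rS is the stellar radius in units of r_S = 2GM and must exceed 1.
    """
    s = math.sqrt(1.0 - 1.0 / R_over_rS)   # (1 - r_S/R)^(1/2)
    return (1.0 - s) / (3.0 * s - 1.0)

# The ratio grows without bound as R/r_S approaches 9/8 = 1.125 from above.
for x in (10.0, 2.0, 1.5, 1.2, 1.13):
    print(x, central_pressure_over_density(x))
```

Below R/r_S = 9/8 the formula returns a negative value, which signals that no static incompressible star exists there.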
Stellar collapse into black holes


Chandrasekhar . . . shows that a star of mass greater than a
certain limit M ... has to go on radiating and radiating and
contracting and contracting until, I suppose, it gets to a few
km. radius, when gravity becomes strong enough to hold in the
radiation, and the star can at last find peace. . . . I think, there
should be a law of nature to prevent a star from behaving in this 
absurd way. 


—Arthur S. Eddington, comments at the Royal Astronomical 
Society Meeting, on January 11, 1935 


As I have already said, a detailed analysis of stellar equilibrium using various equations
of state P(ρ) is beyond the scope of this text, even though, as the reader probably knows,
the results from such an analysis constitute some of the most spectacular highlights of
stellar astrophysics. For example, if the pressure is supplied by the quantum motion of the
electrons in the star, namely the Fermi pressure of degenerate electrons (see any text on
statistical mechanics), the stellar mass M cannot exceed an upper limit of about $1.4 M_\odot$,
known as the Chandrasekhar limit.

To study the collapse of a spherical cloud of matter into a black hole requires generalizing
the metric (4) used here to the time dependent metric mentioned in appendix 3 to chapter
VI.3. The analysis of the resulting Einstein field equation becomes considerably more
complicated.

I am content to point out a key physical feature of the relativistic equation for stellar 
equilibrium. We see that the expression in the first parentheses in (13) effectively changes
the mass density ρ in Newtonian gravity (15) to ρ + P. This important piece of physics can
be traced all the way back to special relativity: pressure, being an energy density, counts 
also as a mass density. Thus, as infalling matter piles onto a superdense star and squeezes 
it gravitationally, the star resists by increasing its internal pressure P, which only adds to 
the mass density bearing in. In essence, this vicious cycle is at the root of the physics of 
black hole formation. 


Gravitational binding energy and Einstein getting almost run over 


Now that you are a budding relativist familiar with the Schwarzschild solution, weren't you
pleased to see (8) and (9) appearing? But at the same time, did you not find the expression
for the total mass


$M = M(R) = 4\pi \int_0^R dr\, r^2 \rho(r)$ (18)


a bit odd? At first sight, it looks like the sum of the infinitesimal mass elements that make
up the star, but then you realize that space is curved!

At a fixed instant in time t, space as a slice of spacetime is described (from (4)) by the
line element $B(r)dr^2 + r^2 d\Omega^2$. Thus, some authors choose to define the integrated mass


$\bar{M} = 4\pi \int_0^R dr\, r^2 \frac{\rho(r)}{\left[1 - \frac{2GM(r)}{r}\right]^{1/2}}$ (19)


and regard the difference $\bar{M} - M > 0$ as the gravitational binding energy of the star. I
must emphasize, though, that I know of no simple experiment that would measure $\bar{M}$.
In contrast, since the exterior Schwarzschild geometry is determined by M, a distant
astronomer knowing Kepler’s laws and measuring the period of a planet orbiting the star
would end up calculating M.
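
To see how large the difference between the two masses can be, one can evaluate the proper-volume integral in (19) for the incompressible star discussed earlier, for which 2GM(r)/r = (r_S/R)(r/R)². With u = r/R, the ratio of the integral in (19) to the mass in (18) becomes 3∫₀¹ du u²/(1 − (r_S/R)u²)^{1/2}. The sketch below evaluates this with a simple midpoint rule; the function name and the choice of quadrature are illustrative assumptions.

```python
import math

def mass_ratio(compactness, n=20000):
    """Ratio of the proper-volume mass (19) to the mass (18) for constant density.

    compactness = r_S / R, which must be less than 1.
    """
    du = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du                      # midpoint of the i-th interval
        total += u * u / math.sqrt(1.0 - compactness * u * u) * du
    return 3.0 * total

for x in (0.01, 0.1, 0.5, 8.0 / 9.0):
    print(x, mass_ratio(x))
```

For small compactness the ratio is approximately 1 + (3/10) r_S/R, so the binding energy is a small fraction of M for ordinary stars and becomes appreciable only for very compact ones.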

I end this chapter by recounting a story told by George Gamow⁴ in his autobiography.
While crossing a street, Gamow mentioned to Einstein that Pascual Jordan had realized 
that a star could be made of nothing if its negative gravitational energy balances its positive 
rest mass energy. According to Gamow, “Einstein stopped in his tracks and . . . several cars 
had to stop to avoid running us down.” 


Appendix: The expanding universe again 


I’d like to mention something amusing here. An astute reader might notice that the setup in this chapter, with the
metric $ds^2 = -A(r)dt^2 + B(r)dr^2 + r^2 d\Omega^2$ in (4), also allows us to solve the equation $R_{\mu\nu} = 8\pi G\Lambda g_{\mu\nu}$ that we
solved in chapters VI.2 and VI.5. We can solve for a universe filled with the cosmological constant (which may well
be the dark energy, as mentioned in chapter VI.2). For ρ = Λ, we have immediately $M(r) = 4\pi\Lambda r^3/3$. Furthermore,
with P = −Λ, the field equation (7) gives A = 1/B, where we have absorbed an integration constant. After some
algebra, we find that the metric in (4) works out to be


$ds^2 = -\left(1 - H^2 r^2\right)dt^2 + \left(1 - H^2 r^2\right)^{-1}dr^2 + r^2 d\Omega^2$ (20)


with $H^2 = 8\pi G\Lambda/3$ (as in chapter VI.5).
“What is going on?” you exclaim. Back in chapters VI.2 and VI.5, filling the universe with a cosmological
constant gave an exponentially expanding universe described by


$ds^2 = -dt^2 + e^{2Ht}\left(dx^2 + dy^2 + dz^2\right)$ (21)


But what is expanding in (20)? That metric does not even depend on time! I will let you think about that one for
a moment.

It is the magic of coordinate transformation, of course! You can transform (20) into (21). Try it. The precise 
relationship between these two apparently entirely different metrics will be revealed in chapter IX.10: they both 
describe what is called de Sitter spacetime. 
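
For readers who want to check a candidate transformation before attempting the algebra, here is a numerical sketch. It assumes one particular map, which is not given in the text at this point: with x the radial coordinate of the spatially flat form (21), take r = x e^{Ht} and t_static = t − (1/2H) log(1 − H²r²). The code pulls the (t, r) part of (20) back to (t, x) by finite differences and compares it with (21). The function names and the value of H are illustrative.

```python
import math

H = 0.5  # illustrative value of the expansion rate

def to_static(t, x):
    """Assumed map from the coordinates (t, x) of (21) to (t_static, r) of (20)."""
    r = x * math.exp(H * t)
    t_static = t - math.log(1.0 - (H * r) ** 2) / (2.0 * H)
    return t_static, r

def pulled_back_metric(t, x, h=1e-6):
    """Components g_tt, g_tx, g_xx obtained by pulling back (20), valid for H r < 1."""
    ts_t = (to_static(t + h, x)[0] - to_static(t - h, x)[0]) / (2.0 * h)
    r_t = (to_static(t + h, x)[1] - to_static(t - h, x)[1]) / (2.0 * h)
    ts_x = (to_static(t, x + h)[0] - to_static(t, x - h)[0]) / (2.0 * h)
    r_x = (to_static(t, x + h)[1] - to_static(t, x - h)[1]) / (2.0 * h)
    r = to_static(t, x)[1]
    f = 1.0 - (H * r) ** 2
    g_tt = -f * ts_t**2 + r_t**2 / f
    g_tx = -f * ts_t * ts_x + r_t * r_x / f
    g_xx = -f * ts_x**2 + r_x**2 / f
    return g_tt, g_tx, g_xx

g_tt, g_tx, g_xx = pulled_back_metric(0.3, 0.4)
print(g_tt, g_tx, g_xx, math.exp(2.0 * H * 0.3))  # compare with -1, 0, e^{2Ht}
```

The angular part needs no separate check under this map, since r² dΩ² = e^{2Ht} x² dΩ² is just the angular part of e^{2Ht}(dx² + dy² + dz²) written in spherical coordinates.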

Another astute reader might notice that the solution here is not the most general. The equation (9), $\frac{dM}{dr} = 4\pi r^2 \Lambda$,
also allows the solution $M(r) = M + 4\pi\Lambda r^3/3$, with an arbitrary additive constant* M. Using (7), you can
check that A = 1/B continues to hold. Thus, another solution is given by


$A = \frac{1}{B} = 1 - \frac{2GM}{r} - H^2 r^2$ (22)


Interestingly, we can put† a black hole in de Sitter spacetime.


* Not to be confused with the mass of the star, of course. We are now talking about an entirely different physical
situation. This additive constant is not allowed in the context of the stellar problem, since spacetime is required
to be nonsingular at the center r = 0 of the star.

† This fact turns out to be of great relevance to recent developments in theoretical physics.
Exercises


1 Derive (16). 


2 Find some analytic solutions of the Tolman-Oppenheimer-Volkoff equation. For help, read section 3 of 
Tolman’s 1939 paper cited in the endnotes. 


Notes 


1. Actually, most stars never evolve into structures dense enough for general relativity to play a role. 

2. It's perhaps worth remarking somewhere, so it might as well be here, that Einstein’s quip regarding the
left hand side of his field equation versus the right hand side (mentioned in chapter VI.5) is not as clear-cut
as it sounds upon first hearing: in general, $T_{\mu\nu}$ depends on the metric, so that geometry does not appear
exclusively on the left side. Here A and B appear explicitly on the right side of (5) and (6).

3. R.C. Tolman, Phys. Rev. 55 (1939), p. 364; J. R. Oppenheimer and G. M. Volkoff, Phys. Rev. 55 (1939), p. 374. 

4. G. Gamow, My World Line. 
Rotating bodies: General considerations


Rotating black holes are important for practical and theoretical reasons. 

Astrophysical objects invariably rotate due to the chaotic way they are formed. For 
normal stars, such as the sun, the amount of rotation, as measured by the rotational speed 
v at the surface divided by c, is negligible. That’s why we are able to use the Schwarzschild 
metric to describe the spacetime outside the sun. Around a black hole, however, infalling 
debris invariably causes the black hole to spin, as we saw in chapter VII.1. Several important 
astrophysical processes appear to be powered by rotating black holes. One of our goals in 
this chapter is to understand how rotating black holes can be such powerful sources of 
radiation. 

Historically, the Schwarzschild horizon bothered the founding fathers of general relativity
so much that some of them suggested that its presence is an artifact of the spherical
symmetry and that a rotating black hole would be free of such bizarre features. The discovery
of a rotating black hole solution of Einstein’s field equation by Roy Kerr in 1963 finally
put this supposition to rest.

This chapter may be skipped over upon first reading; a first understanding of Einstein
gravity does not require mastering the Kerr solution. Like the Schwarzschild spacetime,
the Kerr spacetime is a solution of Einstein’s field equation $R_{\mu\nu} = 0$ in empty spacetime.
We anticipate that there could be a horizon. A rotating object small enough to fit inside its
own horizon is known as a Kerr black hole.

Before we look at Kerr’s specific solution, let us see how far we can get with general
considerations. We assume stationary and cylindrical symmetry. Stationary means that
the object rotates with a constant angular velocity, so that the spacetime does not change
with time. With the usual coordinates $(t, r, \theta, \varphi)$, the metric components $g_{\mu\nu}(r, \theta)$ are
functions only of r and θ, but not of t and φ. Furthermore, the solution must be unchanged
under the discrete transformation $t \to -t$ together with $\varphi \to -\varphi$. This rules out in $ds^2$
cross terms such as $dt\,dr$, $dr\,d\varphi$, and so forth, allowing only $dt\,d\varphi$. (Convince yourself
that by a coordinate transformation (see exercise 1), you can also eliminate the cross term
$dr\,d\theta$.) Hence in general, the spacetime outside is described by


$ds^2 = g_{tt}dt^2 + g_{rr}dr^2 + g_{\theta\theta}d\theta^2 + g_{\varphi\varphi}d\varphi^2 + 2g_{t\varphi}\,dt\,d\varphi$ (1)


with $g_{tt}$, $g_{rr}$, $g_{\theta\theta}$, $g_{\varphi\varphi}$, $g_{t\varphi} = g_{\varphi t}$ five functions of r and θ. The appearance of the
off-diagonal component $g_{t\varphi} = g_{\varphi t}$ in the metric will lead to fascinating new physics.

The term $g_{t\varphi}\,dt\,d\varphi$ in $ds^2$ means that $ds^2$ is no longer invariant under $t \to -t$, as was
the case with the Schwarzschild solution. That the spacetime is invariant only under the
combined transformation $t \to -t$ and $\varphi \to -\varphi$ is a hallmark of rotation about the z-axis.

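Note that we can also write


$ds^2 = \left(g_{tt} - \frac{g_{t\varphi}^2}{g_{\varphi\varphi}}\right)dt^2 + g_{\varphi\varphi}\left(d\varphi - \omega\,dt\right)^2 + g_{rr}dr^2 + g_{\theta\theta}d\theta^2$ (2)

with $\omega(r, \theta) = -g_{t\varphi}/g_{\varphi\varphi}$.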
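That the metric $g_{\mu\nu}(r, \theta)$ does not depend on t and φ immediately implies that two of the
equations of motion amount to conservation laws. Varying the action for a point particle


$S = \int \left(g_{tt}dt^2 + g_{rr}dr^2 + g_{\theta\theta}d\theta^2 + g_{\varphi\varphi}d\varphi^2 + 2g_{t\varphi}\,dt\,d\varphi\right)^{1/2}$ (3)


with respect to t gives the geodesic equation $\frac{d}{d\tau}\left(g_{tt}\frac{dt}{d\tau} + g_{t\varphi}\frac{d\varphi}{d\tau}\right) = 0$, and with respect to
φ gives another geodesic equation $\frac{d}{d\tau}\left(g_{\varphi t}\frac{dt}{d\tau} + g_{\varphi\varphi}\frac{d\varphi}{d\tau}\right) = 0$, which we recognize as energy
and angular momentum conservation, respectively. In other words, the quantities


$\epsilon = -\left(g_{tt}\frac{dt}{d\tau} + g_{t\varphi}\frac{d\varphi}{d\tau}\right)$ (4)


and


$l = g_{\varphi t}\frac{dt}{d\tau} + g_{\varphi\varphi}\frac{d\varphi}{d\tau}$ (5)

do not change along geodesics. More explicitly, a particle of mass m has momentum $p^\mu = m\frac{dx^\mu}{d\tau}$,
and its motion in spacetime conserves the components $p_t = g_{t\nu}p^\nu$ and $p_\varphi = g_{\varphi\nu}p^\nu$
(note the lowered indices).

More formally, our spacetime is isometric* under $t \to t + \text{constant}$ and $\varphi \to \varphi + \text{constant}$,
and thus possesses two Killing vectors† $\xi_t^\mu = (1, 0, 0, 0)$ and $\xi_\varphi^\mu = (0, 0, 0, 1)$. The
two quantities $E = -\xi_t \cdot p = -\xi_t^\mu p_\mu = -p_t = -(g_{tt}p^t + g_{t\varphi}p^\varphi)$ and
$L = \xi_\varphi \cdot p = \xi_\varphi^\mu p_\mu = p_\varphi = (g_{\varphi t}p^t + g_{\varphi\varphi}p^\varphi)$, corresponding to energy and angular momentum, respectively, are
conserved for particles moving around in the spacetime. (The quantities E and L are
simply ε and l, respectively, multiplied by m.)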

It is worthwhile to comment on the signs in (4), (5), and the expressions just given for E
and L. In the (− + + +) convention used here, $g_{\varphi\varphi} > 0$. We define the angular momentum
of a particle whose φ coordinate increases with increasing proper time τ as positive. Hence,
angular momentum is $+p_\varphi$.


* We will explore isometry in detail in chapter IX.6. For now the term “iso + metry” simply means that the
geometry stays the same.

† The rich man would want to write $\xi_t = \xi_t^\mu \partial_\mu = \frac{\partial}{\partial t}$ and $\xi_\varphi = \xi_\varphi^\mu \partial_\mu = \frac{\partial}{\partial \varphi}$, as I explained back in chapter V.5.


Figure 1 Frame dragging due to spacetime being deformed by a
rotating body.


Frame dragging 


From this general form, we can immediately deduce an interesting consequence. The
novel feature is that the angular momentum $l = g_{\varphi t}\frac{dt}{d\tau} + g_{\varphi\varphi}\frac{d\varphi}{d\tau}$ now consists of two terms,
thanks to the presence of the nondiagonal term $g_{t\varphi}$.

Drop a particle, massive or massless,* from far away, with vanishing initial angular
momentum† l = 0, toward the rotating body (notice that, thus far in the discussion,
nothing requires that the metric be that of a black hole). Far away, we expect spacetime to
approach Minkowskian, so that $g_{t\varphi} \to 0$, $g_{\varphi\varphi} \to 1$. Thus l = 0 means that $\frac{d\varphi}{d\tau} \to 0$ far away,
as expected.

But as this particle approaches the rotating body, since the angular momentum l, being
conserved, stays at 0, the particle picks up, according to (5), a position dependent angular
velocity


$\omega(r, \theta) = \frac{d\varphi}{dt} = \frac{d\varphi}{d\tau}\Big/\frac{dt}{d\tau} = -\frac{g_{t\varphi}}{g_{\varphi\varphi}}$ (6)


Angular velocity without angular momentum!

Note that this angular velocity is defined by the rate of change of φ with respect to
coordinate time, not proper time. Furthermore, recall that ω(r, θ) has already been defined
in (2): this discussion reveals its physical meaning.

We interpret this peculiar phenomenon, known as frame dragging, as due to spacetime
being deformed by the rotating body (see figure 1). We fix the direction of rotation by taking
$g_{t\varphi} < 0$ so that ω > 0 (since $g_{\varphi\varphi} > 0$ in our (− + + +) convention).


* For a massless particle, the parameter τ should be interpreted as an affine parameter, not proper time, as
has already been explained a number of times.

† As always, we prefer to be less wordy at the cost of some loss of precision. Thus, we generally eschew saying
things like “angular momentum per unit mass” if we can get away with it without confusing anybody.
It is worth emphasizing that frame dragging is not a mysterious effect associated
somehow only with black holes, as some people confusedly think. It is a general relativistic
effect, which does not occur in Newtonian gravity, generated by any rotating massive body.
In particular, the earth drags the spacetime frame around it.

The Kerr solution turns out to be arithmetically complex, and so it would be advisable
to do a back-of-the-envelope estimate of this effect at this point. Let the body rotate with
angular velocity Ω ~ v/R, where R denotes the object’s characteristic size. Through gravity
the object curves the spacetime around it, which causes the particle, in seeking the best
deal (namely the geodesic), to be¹ “dragged along.” The strength of gravity is characterized
by the dimensionless ratio $GM/Rc^2$, as we have seen many times. Thus, we might expect


$\omega \sim \frac{GM}{Rc^2}\,\Omega \sim \left(\frac{GM}{Rc^2}\right)\left(\frac{v}{R}\right)$ (7)
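
As an illustration of how small this effect is for the earth, the estimate (7) can be evaluated with standard values for the earth's mass, radius, and rotation rate. The sketch below is an order-of-magnitude estimate only, with all factors of order 1 ignored, exactly as in (7).

```python
G = 6.674e-11         # m^3 kg^-1 s^-2
c = 2.998e8           # m/s
M_earth = 5.972e24    # kg
R_earth = 6.371e6     # m
Omega_earth = 7.292e-5  # rad/s, one rotation per sidereal day

strength = G * M_earth / (R_earth * c**2)   # dimensionless GM / (R c^2)
omega_drag = strength * Omega_earth         # estimate (7), in rad/s

seconds_per_year = 3.156e7
print(strength, omega_drag, omega_drag * seconds_per_year)
```

The dragging rate comes out many orders of magnitude below the earth's own rotation rate, which is why the effect is so hard to measure.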


Let us also anticipate how the Kerr solution would be parametrized. In units with
G = 1 and c = 1, M has dimensions of length, and angular momentum J ~ MvR has
dimensions of length squared. Let us measure rotation by the length a = J/M. Indeed,
it is convenient to continue using the length $r_S = 2GM = 2M$, even though we are not
dealing with the Schwarzschild solution here. We then have the dimensionless measure
of rotation $2a/r_S = J/M^2$, which is of order ~ v/c if we set $R \sim r_S \sim M$.


Stationary limit surface 


Consider a light ray emitted, initially with dr = 0 and dθ = 0, from some point. We have
initially $0 = ds^2 = g_{tt}dt^2 + 2g_{t\varphi}\,dt\,d\varphi + g_{\varphi\varphi}d\varphi^2$. Solving this quadratic equation for dφ,
we obtain

$\frac{d\varphi}{dt} = \frac{-g_{t\varphi} \pm \sqrt{g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi}}}{g_{\varphi\varphi}} = -\frac{g_{t\varphi}}{g_{\varphi\varphi}} \pm \sqrt{\left(\frac{g_{t\varphi}}{g_{\varphi\varphi}}\right)^2 - \frac{g_{tt}}{g_{\varphi\varphi}}}$ (8)

To save writing, it is customary to define, as in (6), $\omega = -g_{t\varphi}/g_{\varphi\varphi} > 0$ (which we note is a
function of r and θ).

Far away from the rotating body or black hole, with $g_{tt} < 0$ and $g_{\varphi\varphi} > 0$, we have two
roots

$\Omega_+ = \left(\frac{d\varphi}{dt}\right)_+ = \omega + \sqrt{\omega^2 + \left|\frac{g_{tt}}{g_{\varphi\varphi}}\right|} > 0$ (9)

and

$\Omega_- = \left(\frac{d\varphi}{dt}\right)_- = \omega - \sqrt{\omega^2 + \left|\frac{g_{tt}}{g_{\varphi\varphi}}\right|} < 0$ (10)


(Recall that ω > 0.) These two quantities, $\Omega_+$ and $\Omega_-$, one positive and one negative,
describe two light rays, known as corotating and counterrotating, respectively, emitted
in the same and in the opposite direction as the direction of rotation. Note that, except
in the equatorial plane θ = π/2, light rays do not maintain dθ = 0; hence in general, $\Omega_\pm$
denotes the angular velocities at emission only.


Figure 2 A schematic plot of $\Omega_+$ and $\Omega_-$ in the equatorial plane as a function
of r for a Kerr black hole with $r_S = 3$ and a = 1 (and hence $r_{st} = r_S = 3$
and $r_+ \approx 2.618$). The upper and lower curves correspond to $\Omega_+$ and $\Omega_-$,
respectively. On the stationary limit surface $r = r_{st}$, $g_{tt}$ vanishes, so that
$\Omega_+ = 2\omega$ and $\Omega_- = 0$: the lower curve crosses the horizontal axis. The rotating
body has caused the counterrotating light on the stationary limit surface to
stand still. On the outer horizon, $r = r_+$, $\Omega_+ = \Omega_-$, and the upper and lower
curves meet.
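
The two roots in (8) are easy to explore numerically. The sketch below is a minimal illustration with hypothetical function names; the metric components passed in are made-up sample numbers, not values taken from the Kerr solution. It computes the corotating and counterrotating angular velocities and checks that each one makes ds² vanish.

```python
import math

def light_angular_velocities(g_tt, g_tphi, g_phiphi):
    """Return (Omega_plus, Omega_minus) from (8), for dr = dtheta = 0."""
    omega = -g_tphi / g_phiphi
    root = math.sqrt(omega**2 - g_tt / g_phiphi)
    return omega + root, omega - root

def null_interval(g_tt, g_tphi, g_phiphi, Omega):
    """ds^2 / dt^2 for dr = dtheta = 0 and dphi/dt = Omega; zero for light."""
    return g_tt + 2.0 * g_tphi * Omega + g_phiphi * Omega**2

# Sample components (illustrative only): outside, on, and inside a g_tt = 0 surface
for g_tt in (-0.5, 0.0, 0.01):
    plus, minus = light_angular_velocities(g_tt, -0.3, 2.0)
    print(g_tt, plus, minus, null_interval(g_tt, -0.3, 2.0, plus))
```

With these sample numbers the counterrotating root is negative for g_tt < 0, vanishes at g_tt = 0, and turns positive for g_tt > 0, which is the behavior described in the text.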

It is worthwhile to note that the discussion here, which involves light, is not to be confused
with the discussion of frame dragging in the preceding section, which applies to massive
as well as massless particles. In particular, there we specialized to $l = 0$ for simplicity.
In contrast, $l$ is not specified here. Students sometimes confound these two distinct
discussions, since some of the same metric components, $g_{t\varphi}$ and $g_{\varphi\varphi}$, are involved. Note,
however, that $g_{tt}$, which appears in (9) and (10), did not enter into the discussion in the
preceding section.

Indeed, we are now going to talk about $g_{tt}$. Far away, $g_{tt} \simeq -1 < 0$. Suppose that, as
we come in closer, there exists a surface, known as a stationary limit surface, on which
$g_{tt} = 0$. Then on that surface, $\Omega_+ = 2\omega$ and $\Omega_- = 0$. The rotating body has caused the
counterrotating light on the stationary limit surface to stand still!

As we will see, in the Kerr solution, $g_{tt}$ does vanish as we come in. In figure 2, we plot
$\Omega_+$ and $\Omega_-$ for a particular Kerr black hole.
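
The values quoted in the caption of figure 2 can be reproduced with a few lines of code. What follows is only a minimal sketch: it assumes the equatorial-plane components of the Kerr metric (15) given later in this chapter, and the function names are illustrative choices, not anything defined in the text.

```python
import math

def kerr_equatorial(r, r_s, a):
    """g_tt, g_tphi, g_phiphi of the Kerr metric (15) at theta = pi/2, where rho^2 = r^2."""
    g_tt = -(1.0 - r_s / r)
    g_tphi = -r_s * a / r
    g_phiphi = r**2 + a**2 + r_s * a**2 / r
    return g_tt, g_tphi, g_phiphi

def light_angular_velocities(r, r_s, a):
    """The two roots (Omega_plus, Omega_minus) of the quadratic solved in (8)."""
    g_tt, g_tphi, g_phiphi = kerr_equatorial(r, r_s, a)
    omega = -g_tphi / g_phiphi
    # Clamp tiny negative round-off at the horizon, where the discriminant vanishes.
    disc = max(omega**2 - g_tt / g_phiphi, 0.0)
    return omega + math.sqrt(disc), omega - math.sqrt(disc)

if __name__ == "__main__":
    r_s, a = 3.0, 1.0                                  # values used in figure 2
    r_plus = 0.5 * (r_s + math.sqrt(r_s**2 - 4.0 * a**2))
    for r in (10.0, r_s, 2.8, r_plus):
        print(r, light_angular_velocities(r, r_s, a))
```

Scanning $r$ inward with this sketch shows $\Omega_-$ changing sign at $r = r_S = 3$ and merging with $\Omega_+$ at $r_+ \simeq 2.618$, as drawn schematically in the figure.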

By the discussion back in chapter V.4, the $g_{tt} = 0$ surface is also the surface of infinite
redshift. It is worth emphasizing that for a rotating black hole, there is no a priori reason
why $g_{rr}$ must blow up where $g_{tt}$ vanishes, as in the Schwarzschild solution.

Even closer in, $g_{tt}$ turns positive. Both

$$\Omega_+ = \omega + \sqrt{\omega^2 - \frac{g_{tt}}{g_{\varphi\varphi}}} \quad \text{and} \quad \Omega_- = \omega - \sqrt{\omega^2 - \frac{g_{tt}}{g_{\varphi\varphi}}}$$

are now positive. Even light can no longer move in the direction opposite to that of the
rotation. All particles are swept along, hence the term “stationary limit surface.” Inside
this surface, you are swept along with the flow no matter how powerful your rocket pack
might be. Everything is moving in the same direction as the central body rotates.

A clarifying remark here might be helpful. The angular velocities $\Omega_\pm = (d\varphi/dt)_\pm$ refer to
how the test particle moves through the coordinate $\varphi$ fixed with respect to the stars. Relative
to the frame being dragged along, $\Omega_+ - \omega = (d\varphi/dt)_+ - \omega$ and $\Omega_- - \omega = (d\varphi/dt)_- - \omega$ remain
positive and negative, respectively.

Moving ever closer in, we may reach a point at which $g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi} = 0$: on this surface,
corotating and counterrotating light are emitted with the same angular velocity $\Omega_H = \Omega_+ =
\Omega_-$. You have no choice as an observer: your angular velocity $\Omega_{\rm observer}$, squeezed between
$\Omega_+$ and $\Omega_-$, must be equal to $\Omega_H$. Guess what the subscript $H$ signifies.


Three regimes 


In summary, we have described three regimes: (I) with $\Omega_+$ positive and $\Omega_-$ negative, you
can move left or move right within limits; (II) with $\Omega_+$ and $\Omega_-$ both positive, you are forced
to move right; and finally, (III) with $\Omega_+ = \Omega_-$ both positive, you are forced to move right
in lockstep with everybody else.
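Regarding $g_{\mu\nu}$ as the matrix

$$\begin{pmatrix} g_{tt} & g_{t\varphi} & 0 & 0 \\ g_{\varphi t} & g_{\varphi\varphi} & 0 & 0 \\ 0 & 0 & g_{rr} & 0 \\ 0 & 0 & 0 & g_{\theta\theta} \end{pmatrix}$$

we readily recognize the combination

$$D \equiv g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi} \tag{11}$$

that appears in (8) as minus the determinant of the 2-by-2 submatrix in the upper left
corner. Note how $D$ also appears in (2). The inverse matrix $g^{\mu\nu}$ is given by

$$g^{\mu\nu} = \begin{pmatrix} -D^{-1}g_{\varphi\varphi} & D^{-1}g_{t\varphi} & 0 & 0 \\ D^{-1}g_{t\varphi} & -D^{-1}g_{tt} & 0 & 0 \\ 0 & 0 & 1/g_{rr} & 0 \\ 0 & 0 & 0 & 1/g_{\theta\theta} \end{pmatrix} \tag{12}$$

Thus, at $D = 0$, the inverse metric $g^{\mu\nu}$ ceases to exist. Note also that $g^{rr} = 1/g_{rr}$.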
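
If you would like to convince yourself that (12) really is the inverse, a quick numerical check suffices. The sketch below assumes the coordinate ordering $(t, \varphi, r, \theta)$ used in the matrix above; the helper names and the sample numbers are made up for illustration.

```python
def metric(g_tt, g_tphi, g_phiphi, g_rr, g_thth):
    """The metric as a 4-by-4 matrix in the coordinate order (t, phi, r, theta)."""
    return [
        [g_tt, g_tphi, 0.0, 0.0],
        [g_tphi, g_phiphi, 0.0, 0.0],
        [0.0, 0.0, g_rr, 0.0],
        [0.0, 0.0, 0.0, g_thth],
    ]

def inverse_metric(g_tt, g_tphi, g_phiphi, g_rr, g_thth):
    """The inverse metric written out in (12)."""
    D = g_tphi**2 - g_tt * g_phiphi  # the combination defined in (11)
    if D == 0:
        raise ZeroDivisionError("D = 0: the inverse metric does not exist")
    return [
        [-g_phiphi / D, g_tphi / D, 0.0, 0.0],
        [g_tphi / D, -g_tt / D, 0.0, 0.0],
        [0.0, 0.0, 1.0 / g_rr, 0.0],
        [0.0, 0.0, 0.0, 1.0 / g_thth],
    ]

def matmul(A, B):
    """Plain 4-by-4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

if __name__ == "__main__":
    sample = (-0.4, -0.6, 26.6, 25.0 / 11.0, 25.0)  # arbitrary sample components
    for row in matmul(metric(*sample), inverse_metric(*sample)):
        print(row)
```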


Falling into a rotating black hole 


We have tracked the angular velocity of a test particle of mass $m$ falling into a rotating black
hole. What about the other conserved quantity, its energy $E$?

One subtlety is that, because of the presence of the unfamiliar cross term $g_{t\varphi}$ in the
metric, we have to clarify in our own heads whether it is $p_t$ or $p^t$ that is conserved. The
conserved guy is in fact $p_t$ (which we identified as $-E$), as we have mentioned in passing.

We can solve the mass-shell condition $p^2 = g_{\mu\nu}p^\mu p^\nu = g^{\mu\nu}p_\mu p_\nu = -m^2$ for $E$, as usual.
The key is to use the form $g^{\mu\nu}p_\mu p_\nu$ for $p^2$, since as I just explained, $p_t$ (and $p_\varphi$) are
the conserved quantities we want to work with. Using the inverse metric (12), we write
$g^{\mu\nu}p_\mu p_\nu = -m^2$ as

$$-g_{\varphi\varphi}p_t^2 + 2g_{t\varphi}p_t p_\varphi - g_{tt}p_\varphi^2 + D\left(\frac{p_r^2}{g_{rr}} + \frac{p_\theta^2}{g_{\theta\theta}} + m^2\right) = 0 \tag{13}$$

with $D = g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi}$ as defined in (11). Note the oddly mismatched indices. Since neither
$p_r = g_{rr}p^r$ nor $p^r$ is conserved, we might as well write $p_r^2/g_{rr} = g_{rr}(p^r)^2 = m^2 g_{rr}(dr/d\tau)^2$. In this
way, we express the last three terms in (13) as $K = Dm^2\left(g_{rr}(dr/d\tau)^2 + g_{\theta\theta}(d\theta/d\tau)^2 + 1\right)$.

Solving the quadratic equation (13) for $p_t$, we obtain

$$E = -p_t = -\frac{g_{t\varphi}}{g_{\varphi\varphi}}p_\varphi + \frac{1}{g_{\varphi\varphi}}\sqrt{\left(g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi}\right)p_\varphi^2 + g_{\varphi\varphi}K} \tag{14}$$

Note that, importantly, we have chosen the + root, since far from the black hole (or for a
weakly rotating body), where spacetime becomes Minkowskian and sanity is restored, this
expression corresponds* to the correct $E = +\sqrt{\vec{p}^{\,2} + m^2}$, rather than to $E = -\sqrt{\vec{p}^{\,2} + m^2}$.
Once again, the weird-looking feature in (14), namely a term outside the square root, is
due to the presence of the cross term $g_{t\varphi}$ in the metric. We will return to this expression
in appendix 1.
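
As a sanity check on the choice of root, one can evaluate (14) numerically and feed the result back into (13). The sketch below does just that; the function names and the sample metric components are assumptions chosen for illustration only.

```python
import math

def energy(g_tt, g_tphi, g_phiphi, p_phi, K):
    """E = -p_t from (14), taking the + root."""
    D = g_tphi**2 - g_tt * g_phiphi
    return -(g_tphi / g_phiphi) * p_phi + math.sqrt(D * p_phi**2 + g_phiphi * K) / g_phiphi

def mass_shell_residual(g_tt, g_tphi, g_phiphi, p_t, p_phi, K):
    """Left-hand side of (13), with its last three terms collected into K."""
    return -g_phiphi * p_t**2 + 2.0 * g_tphi * p_t * p_phi - g_tt * p_phi**2 + K

if __name__ == "__main__":
    # Arbitrary sample components with a nonzero cross term g_tphi.
    g_tt, g_tphi, g_phiphi = -0.4, -0.6, 26.6
    p_phi, K = 3.0, 16.5
    E = energy(g_tt, g_tphi, g_phiphi, p_phi, K)
    print(E, mass_shell_residual(g_tt, g_tphi, g_phiphi, -E, p_phi, K))
    # Flat limit: no cross term, g_phiphi = R^2 with R = 2, m = 1, no radial motion,
    # so that D = R^2, K = D m^2, and E should equal sqrt(m^2 + (p_phi / R)^2).
    print(energy(-1.0, 0.0, 4.0, 6.0, 4.0), math.sqrt(1.0 + 9.0))
```

The residual of (13) vanishes to rounding error, and in the flat limit the + root reduces to the familiar positive energy.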


The Kerr black hole 


Thus far, we have been squeezing physics out of the general stationary cylindrically 
symmetric spacetime in (1). To go further and to see that what we say would happen actually 
happens, we need the specific metric of a rotating black hole. 

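In 1963, Kerr found a solution³ of Einstein’s field equation $R_{\mu\nu} = 0$ characterized by two
parameters $r_S$ and $a$ with dimension of length:

$$ds^2 = -\left(1 - \frac{r_S r}{\rho^2}\right)dt^2 - \frac{2r_S a r\sin^2\theta}{\rho^2}dt\,d\varphi + \frac{\rho^2}{\Delta}dr^2 + \rho^2 d\theta^2 + \left(r^2 + a^2 + \frac{r_S a^2 r\sin^2\theta}{\rho^2}\right)\sin^2\theta\,d\varphi^2 \tag{15}$$

with

$$\rho^2 \equiv r^2 + a^2\cos^2\theta \quad \text{and} \quad \Delta \equiv r^2 + a^2 - r r_S \tag{16}$$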
* Take a particle at rest far away, so that $dr/d\tau = 0$, $d\theta/d\tau = 0$, $p_\varphi = 0$, $g_{\varphi\varphi} = 1$, $g_{t\varphi} = 0$, $g_{tt} = -1$, $D = 1$, $K = m^2$,
and so $E$ reduces to $+\sqrt{m^2}$.

Note that in this chapter, $r_S$ is merely a convenient shorthand borrowed from our discussion
of the Schwarzschild black hole: nothing much happens to (15) at $r \simeq r_S$.

Far away from the black hole, as $r \to \infty$, spacetime approaches Minkowskian flat, as
we would expect. In particular, $g_{tt} \to -(1 - r_S/r)$. Using a result we learned as far back as
chapter IV.3, we identify $M = \frac{1}{2}r_S$ as the mass of the rotating black hole.

Our friend the Smart Experimentalist explains, “The mass of an astrophysical object
is not some symbol in a theoretical expression, but a quantity we should inquire of
the astronomer peering through the telescope. The astronomer tells us that the mass is
deduced from the orbit of a test particle, typically a planet, circling that object. Ultimately,
that’s related to the $O(1/r)$ term in $g_{tt}$.”

Excellent! That’s the operational definition of mass. Similarly, the astronomer could
deduce the angular momentum of the astrophysical object in principle (if not in practice)
by watching how a gyroscope* circling the object precesses. That precession is governed
by the asymptotic behavior of $g_{t\varphi}$; in particular, for the Kerr solution, we detect a deviation
from Minkowski spacetime given by

$$-2g_{t\varphi}\,dt\,d\varphi = \frac{2r_S a r\sin^2\theta}{\rho^2}dt\,d\varphi \to \frac{2r_S a\sin^2\theta}{r}dt\,d\varphi = \frac{2r_S a}{r^3}dt\,(x\,dy - y\,dx) \tag{17}$$

upon reverting back, in the last step, to the usual Cartesian coordinates. The angular
momentum of the object is defined to be $J = \frac{1}{2}r_S a = Ma$. Putting this into the form of the
metric in (2), we have $ds^2 = \left(\cdots + r^2\sin^2\theta\,(d\varphi - \frac{2J}{r^3}dt)^2 + \cdots\right)$, since $\omega = -g_{t\varphi}/g_{\varphi\varphi} \to
(r_S a\sin^2\theta/r)/(r^2\sin^2\theta) = r_S a/r^3 = 2J/r^3$. (Note that as remarked earlier, $J$ has
dimensions of length squared.) We should show at some point that this definition of $J$ reduces
in the appropriate limit to what we commonly understand to be angular momentum; we
will do this in chapter IX.4. Thus,

$$a = \frac{J}{M}, \qquad \omega \to \frac{2J}{r^3} \tag{18}$$

In short, the Kerr solution is characterized by two lengths, $r_S$ and $a$, corresponding to mass
and angular momentum, respectively.

Considering that it took⁴ almost 50 years for this solution to be found (while the
Schwarzschild solution was found within a year of 1915), we realize that the Kerr metric
represents a highly nontrivial accomplishment.⁵ Unfortunately, there is not a simple⁶
derivation⁷ of the Kerr metric comparable to the straightforward derivation of the
Schwarzschild metric. See appendix 2 for a possible approach. Of course, it is straightforward,
particularly with the help of a computer, to verify that (15) is in fact a solution.

For this text, I am content to merely introduce you to some key features.


1. Let’s check that our estimate of frame dragging in (7) is on the money: for a slowly rotating
body, $\rho \simeq r$ and $a = J/M \sim Mvr/M \sim vr$, and so indeed $\omega = -g_{t\varphi}/g_{\varphi\varphi} \sim (r_S a r/\rho^2)/r^2 \sim
Ma/r^3 \sim Mv/r^2 \sim (GM/Rc^2)(v/R)$, in agreement with (7). In the last step, we took $R$ to be
the size of a black hole and restored $G$ and $c$. If you like, you can regard this as a verification
of (18) up to an overall factor, at least for a slowly rotating body.

* We will discuss the precession of gyroscopes in chapter IX.2.
2. After admiring (15), we check that it reduces appropriately in various limits.
a. As $r \to \infty$, the metric becomes asymptotically flat, as we have already noted.
b. As $a \to 0$, we recover the Schwarzschild solution:

$$ds^2_{\rm Kerr} = ds^2_{\rm Schwarzschild} - \left(2r_S a\sin^2\theta/r\right)dt\,d\varphi + O(a^2) \tag{19}$$

c. As a particularly interesting limit, take $M \to 0$ and $J \to 0$, keeping the ratio $a =
J/M$ fixed. We obtain

$$ds^2 = -dt^2 + \left(\frac{r^2 + a^2\cos^2\theta}{r^2 + a^2}\,dr^2 + (r^2 + a^2\cos^2\theta)\,d\theta^2 + (r^2 + a^2)\sin^2\theta\,d\varphi^2\right) \tag{20}$$

Are you surprised that you do not recover flat space? But perhaps after a moment
you remember appendix 2 from way way back in chapter I.5: (20) in fact describes
Minkowskian spacetime heavily disguised!


3. The general discussion in the preceding sections, in particular the result (6), holds for the
Kerr solution of course, with the off-diagonal metric component $g_{t\varphi}(r, \theta) = -r_S a r\sin^2\theta/\rho^2$.
A particle dropped with $l = 0$ from far away attains the angular velocity

$$\omega(r, \theta) = -\frac{g_{t\varphi}}{g_{\varphi\varphi}} = \frac{r_S a r}{\rho^2(r^2 + a^2) + r_S a^2 r\sin^2\theta} = \frac{r_S a r}{(r^2 + a^2)^2 - \Delta a^2\sin^2\theta} \tag{21}$$

with $\Delta$ defined in (16).


4. The surface of infinite redshift $g_{tt} = 0$, on which counterrotating light stands still, is given
by $\rho^2 = r r_S$, which has two solutions

$$r_{SL\pm} = \frac{1}{2}\left(r_S \pm \sqrt{r_S^2 - 4a^2\cos^2\theta}\right) = M \pm \sqrt{M^2 - a^2\cos^2\theta} \quad (\text{stationary limit } g_{tt} = 0) \tag{22}$$

There are thus two surfaces of infinite redshift, an outer and an inner. (Note that the $S$ in
$r_{SL}$ is for stationary, while the $S$ in $r_S$ is for Schwarzschild.)


5. In accordance with our general discussion earlier, the Kerr metric is invariant under $t \to
t + \text{constant}$ and $\varphi \to \varphi + \text{constant}$ and thus possesses two Killing vectors $\xi_t = (1, 0, 0, 0)$
and $\xi_\varphi = (0, 0, 0, 1)$.

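6. The Kerr metric can be written in a number of different forms (see exercises 4 and 5).
Define $\Sigma^2 \equiv (r^2 + a^2)\rho^2 + r_S r a^2\sin^2\theta = (r^2 + a^2)^2 - \Delta a^2\sin^2\theta$, so that $g_{\varphi\varphi} = \frac{\Sigma^2}{\rho^2}\sin^2\theta$.
Then, for example, using (2), we can write

$$ds^2 = -\frac{\rho^2\Delta}{\Sigma^2}dt^2 + \frac{\rho^2}{\Delta}dr^2 + \rho^2 d\theta^2 + \frac{\Sigma^2}{\rho^2}\sin^2\theta\,(d\varphi - \omega\,dt)^2 \tag{23}$$

Another form is given by

$$ds^2 = -\frac{\Delta}{\rho^2}\left(dt - a\sin^2\theta\,d\varphi\right)^2 + \frac{\rho^2}{\Delta}dr^2 + \rho^2 d\theta^2 + \frac{\sin^2\theta}{\rho^2}\left((r^2 + a^2)\,d\varphi - a\,dt\right)^2 \tag{24}$$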
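
Several of the statements in the list above lend themselves to a quick numerical cross-check: that (24) is the same metric as (15), that the two expressions for $\omega$ in (21) agree with $-g_{t\varphi}/g_{\varphi\varphi}$, and that $g_{tt}$ vanishes on the surfaces (22). Here is a minimal sketch; the function names and sample point are illustrative assumptions.

```python
import math

def kerr_from_15(r, theta, r_s, a):
    """(g_tt, g_tphi, g_rr, g_thth, g_phiphi) read off from (15)."""
    s2 = math.sin(theta)**2
    rho2 = r**2 + a**2 * math.cos(theta)**2
    delta = r**2 + a**2 - r * r_s
    g_tt = -(1.0 - r_s * r / rho2)
    g_tphi = -r_s * a * r * s2 / rho2
    g_phiphi = (r**2 + a**2 + r_s * a**2 * r * s2 / rho2) * s2
    return g_tt, g_tphi, rho2 / delta, rho2, g_phiphi

def kerr_from_24(r, theta, r_s, a):
    """The same components, obtained by expanding the two squares in (24)."""
    s2 = math.sin(theta)**2
    rho2 = r**2 + a**2 * math.cos(theta)**2
    delta = r**2 + a**2 - r * r_s
    g_tt = (-delta + a**2 * s2) / rho2
    g_tphi = a * s2 * (delta - (r**2 + a**2)) / rho2
    g_phiphi = s2 * ((r**2 + a**2)**2 - delta * a**2 * s2) / rho2
    return g_tt, g_tphi, rho2 / delta, rho2, g_phiphi

def frame_dragging(r, theta, r_s, a):
    """omega(r, theta) from the last expression in (21)."""
    delta = r**2 + a**2 - r * r_s
    return r_s * a * r / ((r**2 + a**2)**2 - delta * a**2 * math.sin(theta)**2)

def stationary_limit(theta, r_s, a, sign=1):
    """r_SL+ (sign=+1) or r_SL- (sign=-1) from (22)."""
    return 0.5 * (r_s + sign * math.sqrt(r_s**2 - 4.0 * a**2 * math.cos(theta)**2))

if __name__ == "__main__":
    point = (4.0, 0.7, 3.0, 1.0)  # sample (r, theta, r_S, a)
    print(kerr_from_15(*point))
    print(kerr_from_24(*point))
    print(frame_dragging(*point))
    print(stationary_limit(0.7, 3.0, 1.0), stationary_limit(0.7, 3.0, 1.0, sign=-1))
```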


Physical and coordinate singularities 


We see that the Kerr metric (15) is singular at $\rho = 0$ and at $\Delta = 0$.

Figure 3 The surfaces $r = r_\pm$ (as explained in the text, surfaces of constant
$r$ are not spheres) and the surfaces of infinite redshift $r = r_{SL\pm}(\theta)$ are shown
schematically. As indicated in the text, the surfaces touch pairwise at the north
and south poles. The region enclosed between the stationary limit surface
$r_{SL}$ and the horizon $r_+$ is known as the ergosphere or ergoregion. Inside
the stationary limit surface of a Kerr black hole, the coordinate $t$ becomes
spacelike and energy morphs into momentum, but if you are outside the
horizon, you can still get out if you want.


In the limit $a \to 0$, $\rho = 0$ reduces to $r = 0$, and $\Delta = 0$ to $r = r_S$. Thus, our experience
with the Schwarzschild black hole suggests that $\rho = 0$ represents a physical singularity
and $\Delta = 0$ a coordinate singularity, a highly plausible supposition that we can check by
calculating $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}$, for example.

The physical singularity corresponds to, according to (16), $r^2 + a^2\cos^2\theta = 0$, that is,
$r = 0$ and $\theta = \frac{\pi}{2}$. But according to appendix 2 of chapter I.5, this describes a ring of
radius $a$.

In contrast, the coordinate singularities at $\Delta = r^2 + a^2 - r r_S = 0$, that is, at

$$r_\pm = \frac{1}{2}\left(r_S \pm \sqrt{r_S^2 - 4a^2}\right) = M \pm \sqrt{M^2 - a^2} \tag{25}$$

describe two ellipsoids. See figure 3.


Extremal black hole 


Note that (25) suggests, but does not prove, that $|a| \le M = \frac{1}{2}r_S$, which according to (18),
corresponds to the maximum angular momentum

$$|J| \le M^2 \tag{26}$$

An extremal Kerr black hole is one with angular momentum $|J| = M^2$. Most astrophysical
black holes are observed to be nearly extremal, as would be expected if infalling debris
tends to increase the angular momentum. Theoretically, extremal black holes also play an
important role in string theory. Note that, heuristically, extremality is attained for $MvR \sim
M^2 v \sim M^2$, that is $v \sim 1$. So physically, it seems plausible that the angular momentum of a
black hole is bounded. For a given mass, you can’t keep on pumping angular momentum
into the system.


The outer horizon 


Let us locate ourselves at some point well outside the black hole (for $r > r_+$ say, so that
$\Delta > 0$) and watch an outgoing photon. Setting $ds = 0$ in (15), we find that $dr$ is given by
the positive square root of the quantity

$$\frac{\rho^2}{\Delta}dr^2 = \left(1 - \frac{r_S r}{\rho^2}\right)dt^2 + \frac{2r_S a r\sin^2\theta}{\rho^2}dt\,d\varphi - \rho^2 d\theta^2 - \left(r^2 + a^2 + \frac{r_S a^2 r\sin^2\theta}{\rho^2}\right)\sin^2\theta\,d\varphi^2$$

We want to see whether the photon manages to get out. So, we want to maximize $dr$.
Evidently, to do this, we should set $d\theta = 0$. But because of the cross term $g_{t\varphi}$, we should not
set $d\varphi = 0$, that is, restrict our attention to radial light rays, as we did for the Schwarzschild
black hole. Physically, this is clear.

We now move in closer and closer to the black hole. At some point, the photon will not
be able to get out. For $r > r_S/2$, the quantity $\Delta = r^2 + a^2 - r r_S$ decreases as $r$ decreases,
eventually vanishing, at which point $dr = 0$. The photon can no longer get out; this defines
the horizon. According to (25), the vanishing of $\Delta$ occurs at $r_\pm$. Our suspicion, based on
applying the “correspondence principle” between the Kerr and Schwarzschild black holes,
turns out to be valid.

For definiteness, let us focus on what happens at $r_+$ (which, as we can see from (25),
is indeed $> r_S/2$). For further analysis, it is more convenient to use (23) rather than (15).
With $ds = 0$ and $dr = 0$ in (23), set $\Delta = 0$ to obtain

$$0 = \rho_+^2 d\theta^2 + \frac{\Sigma_+^2}{\rho_+^2}\sin^2\theta\,(d\varphi - \omega_+ dt)^2 \tag{27}$$

where the subscript + indicates that the various quantities are to be evaluated at $r = r_+$:

$$\rho_+^2 = r_+^2 + a^2\cos^2\theta, \qquad \omega_+ = \frac{r_S a r_+}{\Sigma_+^2} = \frac{r_S a r_+}{r_+^2 r_S^2} = \frac{a}{r_+ r_S}, \qquad \Sigma_+ = r_+^2 + a^2 = r_+ r_S \tag{28}$$

Now we easily solve (27) to find $d\theta = 0$ and

$$\frac{d\varphi}{dt} = \omega_+ = \frac{a}{r_+ r_S} \tag{29}$$

In other words, on the horizon, light rays move along the trajectory

$$(dt, dr, d\theta, d\varphi) \propto l^\mu = \left(1, 0, 0, \frac{a}{r_+ r_S}\right) \tag{30}$$

The horizon is a null surface spanned by this null vector $l^\mu$ and two spacelike vectors,
which we can take to be $h^\mu = (0, 0, 1, 0)$ and $k^\mu = (0, 0, 0, 1)$. Note that these vectors are
orthogonal to $l^\mu$: for instance, $g_{\mu\nu}l^\mu k^\nu = g_{t\varphi} + g_{\varphi\varphi}\omega_+ = 0$ by virtue of the definition of $\omega$.

Note that the normal to this null surface is $l^\mu$ itself. (Recall the discussion of the light cone
in Minkowski spacetime back in chapter III.3.)

Take a constant $t$ slice of this null surface. We obtain from (27) a 2-dimensional surface
with the line element

$$dS^2 = \rho_+^2 d\theta^2 + \frac{\Sigma_+^2}{\rho_+^2}\sin^2\theta\,d\varphi^2 = (r_+^2 + a^2\cos^2\theta)\,d\theta^2 + \frac{(r_+^2 + a^2)^2}{r_+^2 + a^2\cos^2\theta}\sin^2\theta\,d\varphi^2 \tag{31}$$

Those readers with a good memory will recognize this as the squashed sphere you worked
out in exercise I.5.12. For an extremal black hole, the distance around a circle of fixed
longitude through the poles works out to be $\simeq 3.82\,r_S$, considerably less than the distance
(extremal or not) around the equator $2\pi r_S$, in accordance with our intuition about how
spacetime might be squashed around a rotating black hole. Notice that all of this is
happening to empty spacetime; what is being squashed is not a spinning material sphere.
The area of this squashed sphere is given by

$$A = \int d\theta\,d\varphi\,\sqrt{g} = 4\pi\left(r_+^2 + a^2\right) = 8\pi\left(M^2 + \sqrt{M^4 - J^2}\right) \tag{32}$$

(with $g$ the determinant of the metric in (31), as explained back in chapter I.5). As $J \to 0$,
we recover the Schwarzschild result $A = 16\pi M^2$.
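
The numbers just quoted are easy to reproduce by integrating the line element (31) numerically. The following sketch uses a hand-rolled Simpson rule (my choice of method, not the text's) and illustrative function names; it works in units with $G = c = 1$.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def horizon_metric(theta, M, a):
    """(g_thth, g_phiphi) of the 2-dimensional line element (31)."""
    r_plus = M + math.sqrt(M**2 - a**2)
    rho2 = r_plus**2 + a**2 * math.cos(theta)**2
    return rho2, (r_plus**2 + a**2)**2 * math.sin(theta)**2 / rho2

def polar_circumference(M, a):
    """Length of a circle of fixed longitude through both poles."""
    return 2.0 * simpson(lambda th: math.sqrt(horizon_metric(th, M, a)[0]), 0.0, math.pi)

def equatorial_circumference(M, a):
    """Length of the equator theta = pi/2."""
    return 2.0 * math.pi * math.sqrt(horizon_metric(0.5 * math.pi, M, a)[1])

def horizon_area(M, a):
    """Integral of sqrt(g) over theta and phi, as in (32)."""
    def sqrt_g(th):
        g_thth, g_phiphi = horizon_metric(th, M, a)
        return math.sqrt(g_thth * g_phiphi)
    return 2.0 * math.pi * simpson(sqrt_g, 0.0, math.pi)

if __name__ == "__main__":
    M = 1.0
    r_s = 2.0 * M
    print(polar_circumference(M, M) / r_s)       # extremal case a = M
    print(equatorial_circumference(M, M) / r_s)  # compare with 2 pi
    print(horizon_area(M, 0.6), 8.0 * math.pi * (M**2 + math.sqrt(M**4 - 0.6**2)))
```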

Also, note that the combination $D = g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi}$ that appeared inside the square root
in (8) works out nicely to be $\Delta\sin^2\theta$ in the Kerr solution. Thus, the horizon (where $\Delta$
vanishes) acquires another significance: as we have learned, it is where the corotating and
counterrotating light beams have the same angular velocity:

$$\Omega_H = \Omega_+ = \Omega_- = \frac{a}{r_+ r_S} = \frac{a}{r_+^2 + a^2} = \frac{2a}{r_S\left(r_S + \sqrt{r_S^2 - 4a^2}\right)} = \frac{J}{2M\left(M^2 + \sqrt{M^4 - J^2}\right)} \tag{33}$$
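
The identity $D = \Delta\sin^2\theta$ takes a bit of algebra to verify by hand, so a numerical spot check is reassuring. The sketch below evaluates $D$ from the components in (15) at an arbitrary point and also compares two of the equivalent forms of $\Omega_H$ in (33); names and sample values are illustrative assumptions.

```python
import math

def kerr_D(r, theta, r_s, a):
    """D = g_tphi^2 - g_tt g_phiphi, built from the components in (15)."""
    s2 = math.sin(theta)**2
    rho2 = r**2 + a**2 * math.cos(theta)**2
    g_tt = -(1.0 - r_s * r / rho2)
    g_tphi = -r_s * a * r * s2 / rho2
    g_phiphi = (r**2 + a**2 + r_s * a**2 * r * s2 / rho2) * s2
    return g_tphi**2 - g_tt * g_phiphi

def delta(r, r_s, a):
    """Delta as defined in (16)."""
    return r**2 + a**2 - r * r_s

def omega_horizon_from_a(r_s, a):
    """First form in (33): a / (r_+ r_S)."""
    r_plus = 0.5 * (r_s + math.sqrt(r_s**2 - 4.0 * a**2))
    return a / (r_plus * r_s)

def omega_horizon_from_J(M, J):
    """Last form in (33), in terms of M and J."""
    return J / (2.0 * M * (M**2 + math.sqrt(M**4 - J**2)))

if __name__ == "__main__":
    r, theta, r_s, a = 4.0, 0.7, 3.0, 1.0
    print(kerr_D(r, theta, r_s, a), delta(r, r_s, a) * math.sin(theta)**2)
    M, a = 1.0, 0.6
    print(omega_horizon_from_a(2.0 * M, a), omega_horizon_from_J(M, M * a))
```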


Ergoregion and the Penrose process 


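To summarize, there are two surfaces of infinite redshift, an outer and an inner,

$$r_{SL\pm} = \frac{1}{2}\left(r_S \pm \sqrt{r_S^2 - 4a^2\cos^2\theta}\right) = M \pm \sqrt{M^2 - a^2\cos^2\theta} \quad (\text{stationary limit, } g_{tt} = 0) \tag{34}$$

There are also two horizons, an outer and an inner,

$$r_\pm = \frac{1}{2}\left(r_S \pm \sqrt{r_S^2 - 4a^2}\right) = M \pm \sqrt{M^2 - a^2} \quad (\text{horizon, } \Delta = 0) \tag{35}$$

I also remind the reader that in the Kerr solution $D = g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi} = \Delta\sin^2\theta$ vanishes at
the horizon. Hence, $g_{rr} = \rho^2/\Delta \to \infty$ and $g^{rr} = 0$ at the horizon.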

For the rest of our discussion, we will focus on the outer stationary limit surface and the
outer horizon. For the Kerr black hole, these two surfaces, $r_{SL}$ and $r_+$, no longer coincide,
as the corresponding surfaces do for the Schwarzschild black hole. Comparing
(34) and (35), we see that $r_+ \le r_{SL}$, with equality attained at the two poles. Thus, the outer
horizon lies inside the outer stationary limit surface, touching at the two poles, while
the gap between the two surfaces is largest in the equatorial plane $\theta = \pi/2$. We drop the
modifier “outer” henceforth.

The region enclosed between $r_+$ and $r_{SL}$ is known as the ergosphere (“ergo” being Greek
for work, as in ergonomics and erg). The somewhat more accurate but awkward term
ergoregion is also used. While particles cannot stay at rest in the ergoregion, they can still
escape to infinity since they are still outside the outer horizon. See figure 3.

The existence of a region between the stationary limit surface $r_{SL}$ and the horizon
$r_+$ allows exotic new physics not possible with the Schwarzschild black hole. Inside the
stationary limit surface of a Kerr black hole, strange things start happening: in particular,
the coordinate $t$ becomes spacelike, and energy morphs into momentum. But if you are
outside the horizon, you can still get out to tell the tale! Much cooler to fall into a Kerr black
hole than into a Schwarzschild black hole.

Rather craftily, Penrose realized that these considerations allow us to extract energy from
a rotating black hole. Consider a process in which a particle called “infalling” falls in freely
from $\infty$. In the ergoregion, it goes into two particles called “outgoing” and “doomed,” with
their momenta arranged in such a way that while the doomed particle falls through the
horizon, the outgoing particle escapes, moving freely along its geodesic to $\infty$. This could be
a subnuclear process, such as $\pi^+ \to \mu^+ + \nu$ (a pion decaying into a muon plus a neutrino),
or an entirely classical process in which* we throw a bad guy out of our rocketship.
Conservation of 4-momentum $p_{\rm in} = p_{\rm out} + p_{\rm doomed}$ holds, of course. Furthermore, along
the geodesics of the various particles, the quantities $\xi \cdot p_{\rm in}$, $\xi \cdot p_{\rm out}$, and $\xi \cdot p_{\rm doomed}$ are
conserved, that is, they are constants of the motion. Here $\xi$ can be either $\xi_t$ or $\xi_\varphi$. (I remind
you that $\xi_t = (1, 0, 0, 0)$ and $\xi_\varphi = (0, 0, 0, 1)$ denote the two Killing vectors that the Kerr
spacetime possesses.) In other words, for each of the three particles, we have energy and
angular momentum conservation.

Confusio looks a bit bewildered. “It seems like we are talking about conservation in two 
distinct ways.” 

Indeed! The discussion in some textbooks appears to be confused on this issue. That the
sums of the momenta are the same before and after some local process is a direct consequence
of the equivalence principle, which says that in a small enough region of spacetime
(in a neighborhood around where the pion decays, for example), we can choose coordinates
so that physics is exactly as it would be in flat spacetime. More mathematically, it follows
from $D_\mu T^{\mu\nu} = 0$, which in turn follows from general covariance (as was discussed in chapter
VI.5). In contrast, the constancy of $\xi \cdot p$ along various geodesics follows from specific
properties of the spacetime we are in, namely its invariance under translation in $t$ and $\varphi$.

* This reminds me of action movies or kung fu stories in which, as you and a bad guy fall off a cliff, you,
being a physics student, give the bad guy a downward shove, and exploiting Newton’s law of action and reaction,
bounce back onto the cliff.

Let us contract the Killing vector $\xi_t$ with momentum conservation to obtain

$$\xi_t \cdot p_{\rm in} = \xi_t \cdot p_{\rm out} + \xi_t \cdot p_{\rm doomed} \tag{36}$$

Write $E_{\rm in}(\infty) = -\xi_t \cdot p_{\rm in}$ and $E_{\rm out}(\infty) = -\xi_t \cdot p_{\rm out}$ to emphasize that these are the energies
that the infalling and outgoing particles have at $\infty$, far away from the Kerr black
hole. Of course, as always, being energies of physical particles, $E_{\rm in}(\infty)$ and $E_{\rm out}(\infty)$ are
necessarily positive. In sharp contrast, we write simply $\epsilon_{\rm doomed} = -\xi_t \cdot p_{\rm doomed}$, without
any $(\infty)$ symbol, to indicate that the doomed particle never gets out to flat spacetime, and
so $\epsilon_{\rm doomed}$ can be either positive or negative. Indeed, since $g_{\mu\nu}\xi_t^\mu\xi_t^\nu = g_{tt} = -(1 - \frac{r_S r}{\rho^2}) > 0$
for $r < r_{SL}$, in the ergoregion, the Killing vector $\xi_t$ is spacelike, and $\epsilon_{\rm doomed}$ is actually a
momentum rather than an energy. From (36), we have

$$E_{\rm out}(\infty) = E_{\rm in}(\infty) - \epsilon_{\rm doomed} \tag{37}$$

and so we conclude that indeed it is possible, with $\epsilon_{\rm doomed} < 0$, to have $E_{\rm out}(\infty) > E_{\rm in}(\infty)$,
that is, to get out more energy than we put in. The black hole would end up losing some of
its mass $M$ in the deal. Note that, in contrast⁸ to the Hawking radiation discussed in the
preceding chapter, the Penrose process is completely classical.
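
To see concretely that negative $\epsilon_{\rm doomed}$ is available only inside the ergoregion, one can evaluate (14) for a particle with large negative angular momentum. The sketch below makes simplifying assumptions of my own: the particle sits in the equatorial plane and is momentarily not moving in $r$ or $\theta$, so that $K = Dm^2$; the parameter values $r_S = 3$, $a = 1$ are those of figure 2, and the function names are illustrative.

```python
import math

def equatorial_components(r, r_s, a):
    """g_tt, g_tphi, g_phiphi of the Kerr metric (15) at theta = pi/2."""
    return -(1.0 - r_s / r), -r_s * a / r, r**2 + a**2 + r_s * a**2 / r

def conserved_energy(r, L, m, r_s, a):
    """-xi_t . p from (14), assuming dr/dtau = dtheta/dtau = 0 so that K = D m^2."""
    g_tt, g_tphi, g_phiphi = equatorial_components(r, r_s, a)
    D = g_tphi**2 - g_tt * g_phiphi
    omega = -g_tphi / g_phiphi
    return omega * L + math.sqrt(D * L**2 + g_phiphi * D * m**2) / g_phiphi

if __name__ == "__main__":
    r_s, a, m = 3.0, 1.0, 1.0
    # Inside the ergoregion (r_+ < 2.8 < r_SL = 3) with counterrotating L < 0.
    eps_doomed = conserved_energy(2.8, -20.0, m, r_s, a)
    print("eps_doomed inside ergoregion:", eps_doomed)
    # Outside the stationary limit surface the same L gives a positive value.
    print("same L at r = 5:", conserved_energy(5.0, -20.0, m, r_s, a))
    # Bookkeeping of (37) for a hypothetical infalling energy.
    E_in = 1.0
    print("E_out =", E_in - eps_doomed)
```

With these sample numbers the doomed particle carries negative $\epsilon_{\rm doomed}$ at $r = 2.8$, while no choice of $L$ makes the energy negative at $r = 5$.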

Thus, at least in principle,⁹ we could solve¹⁰ both the world’s energy crisis and garbage
problem, haha.


Angular momentum loss 


Confusio lights up: “Wonderful! But what about the other Killing vector $\xi_\varphi$?”

Excellent question! Confusio is getting smarter by the day. Consider an observer inside
the stationary limit surface. Let his 4-velocity be given by $U^\mu = U^0(1, 0, 0, \Omega_{\rm observer})$.
In other words, for this observer, $\frac{dt}{d\tau} = U^0$ and $\frac{d\varphi}{d\tau} = U^0\Omega_{\rm observer}$. The angular velocity
$\Omega_{\rm observer}$ must be positive, since like everybody else, the observer has to move in the
direction of rotation, as explained earlier. We now use basic linear algebra to write $U^\mu =
U^0(\xi_t^\mu + \Omega_{\rm observer}\xi_\varphi^\mu)$. It is worth emphasizing that the observer is not necessarily moving
along a geodesic; he is certainly free to purchase a rocket pack and attach it to his back.

What is the energy of the doomed particle as seen by this observer? As usual, this is
given by

$$-U \cdot p_{\rm doomed} = -U^0\left(\xi_t \cdot p_{\rm doomed} + \Omega_{\rm observer}\,\xi_\varphi \cdot p_{\rm doomed}\right) = U^0\left(\epsilon_{\rm doomed} - \Omega_{\rm observer}L_{\rm doomed}\right) \tag{38}$$

where $L_{\rm doomed} = +\xi_\varphi \cdot p_{\rm doomed}$ is the angular momentum (with the plus sign, as explained
earlier) of the doomed particle. But the energy of the doomed particle as measured by this
observer must be positive, and hence $\epsilon_{\rm doomed} \ge \Omega_{\rm observer}L_{\rm doomed}$, which, since $\Omega_{\rm observer}$ is
positive, can be written as $\epsilon_{\rm doomed}/\Omega_{\rm observer} \ge L_{\rm doomed}$.

In the Penrose process, $\epsilon_{\rm doomed}$ is negative, and so $L_{\rm doomed}$ must also be negative. By
angular momentum conservation, as the doomed particle falls into the black hole, it reduces
the black hole’s angular momentum. In other words, the mass and angular momentum
of the black hole change by $\delta M = \epsilon_{\rm doomed} < 0$ and $\delta J = L_{\rm doomed} < 0$, respectively.

Not only does the black hole lose mass in the deal, but it also loses angular momentum
by an amount $\delta J$ satisfying

$$\frac{\delta M}{\Omega_{\rm observer}} \ge \delta J \tag{39}$$

By extracting energy from a Kerr black hole, we also decrease its angular momentum and
hence reduce it eventually to a Schwarzschild black hole.

It is important to note that even if the process is not Penrose, namely if $\epsilon_{\rm doomed}$ is positive,
we still have $\frac{\delta M}{\Omega_{\rm observer}} \ge \delta J$, with $\delta M$ positive in this case.


Area theorem 


Recall the area $A = 8\pi\left(M^2 + \sqrt{M^4 - J^2}\right)$ of the black hole from (32). In a Penrose process,
both $M$ and $J$ decrease. How does this area change? Varying, we find

$$\frac{\delta A}{8\pi} = \frac{2M\left(M^2 + \sqrt{M^4 - J^2}\right)\delta M - J\,\delta J}{\sqrt{M^4 - J^2}} = \frac{J}{\sqrt{M^4 - J^2}}\left(\frac{\delta M}{\Omega_H} - \delta J\right) \tag{40}$$

where we used (33) in the last step.

But since the inequality (39) holds for any observer, it also holds for an observer hovering
just outside the horizon, an observer whose angular velocity, as we saw in the discussion
leading up to (33), is equal to $\Omega_H$. Inserting $\frac{\delta M}{\Omega_H} \ge \delta J$ (which, as we had emphasized, holds
regardless of whether the process is Penrose or not) into (40), we conclude that

$$\delta A \ge 0 \tag{41}$$


Remarkably, the area always increases! This result is often stated as the second law of 
black hole thermodynamics: no classical process can decrease the area of a black hole. As 
mentioned in the preceding chapter, results like this inspired Bekenstein to conjecture 
that the surface area of black holes should be associated with an entropy. 
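
The variation (40) and the resulting sign of $\delta A$ can be checked against a brute-force finite difference of (32). This is a rough sketch under my own choice of sample values ($M = 1$, $J = 0.6$) and function names; the step sizes are small but finite, so agreement is only to first order.

```python
import math

def horizon_area(M, J):
    """Area (32) of the horizon."""
    return 8.0 * math.pi * (M**2 + math.sqrt(M**4 - J**2))

def omega_horizon(M, J):
    """Angular velocity (33) of the horizon."""
    return J / (2.0 * M * (M**2 + math.sqrt(M**4 - J**2)))

def delta_area(M, J, dM, dJ):
    """First-order change in the area, using the last form of (40)."""
    root = math.sqrt(M**4 - J**2)
    return 8.0 * math.pi * J * (dM / omega_horizon(M, J) - dJ) / root

if __name__ == "__main__":
    M, J = 1.0, 0.6
    dJ = -1.0e-6                              # the hole loses angular momentum
    dM = 0.5 * omega_horizon(M, J) * dJ       # a Penrose step obeying dM >= Omega_H dJ
    exact = horizon_area(M + dM, J + dJ) - horizon_area(M, J)
    print(dM, exact, delta_area(M, J, dM, dJ))
```

Even though both $\delta M$ and $\delta J$ are negative in this example, the area goes up.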


Appendix 1: First and second laws of black hole thermodynamics 


Let’s now follow a particle falling through the horizon of a Kerr black hole. Return to (14), which we rewrite here
in a slightly more concise form for convenience:

$$E = -p_t = +\omega p_\varphi + \sqrt{\frac{D p_\varphi^2}{g_{\varphi\varphi}^2} + \frac{K}{g_{\varphi\varphi}}} \tag{42}$$

with $K = Dm^2\left(g_{rr}(\frac{dr}{d\tau})^2 + g_{\theta\theta}(\frac{d\theta}{d\tau})^2 + 1\right)$. Since $p_t$ is conserved along the particle’s geodesic, we can calculate it
at the instant the particle crosses the horizon, where $D = \Delta\sin^2\theta$ vanishes. Things simplify enormously! Do
calculate before reading on.


Dear reader, if you drop the square root in (42), you would have made a hasty error. You should have checked
if anything is blowing up as $D \to 0$. Indeed, as I noted after (35), $g_{rr} = \rho^2/\Delta \to \infty$, such that $Dg_{rr} \to \rho_+^2\sin^2\theta$.
In contrast, nothing much happens to (see (23), for example) $g_{\theta\theta} = \rho^2$ and $g_{\varphi\varphi} = \frac{\Sigma^2}{\rho^2}\sin^2\theta$. Hence, $K \to
\rho_+^2\sin^2\theta\,(m\frac{dr}{d\tau})^2$, so that $\frac{K}{g_{\varphi\varphi}} \to \frac{\rho_+^4}{\Sigma_+^2}(m\frac{dr}{d\tau})^2$. Thus, evaluating (42) on the horizon, we determine the energy
of the infalling particle to be $E = \Omega_H L + \frac{\rho_+^2}{\Sigma_+}\,m\left|\frac{dr}{d\tau}\right|_+$ with $L = p_\varphi$. The mass and angular momentum of
the black hole change according to (recall (9))

$$\delta M = \Omega_H\,\delta J + \frac{r_+^2 + a^2\cos^2\theta}{r_+^2 + a^2}\left|\frac{dr}{d\tau}\right|_+ m \tag{43}$$

You of course recall that we proved the area theorem $\delta A \ge 0$ by inserting the key inequality $\frac{\delta M}{\Omega_H} \ge \delta J$ into
(40). But now we learn more. We have identified what caused the area theorem to be an inequality rather than
an equality: radial movement.

This result also serves to convince us of the physically motivated conclusion we reached in the text, that the
angular momentum $|J|$ cannot exceed $M^2$. For an extremal black hole with $J = M^2$, we see from (33) that
the angular velocity at the horizon $\Omega_H = \frac{J}{2M(M^2 + \sqrt{M^4 - J^2})}$ evaluates to $\Omega_H = \frac{1}{2M}$. Let us try to crank up $J$ of
an extremal black hole past $M^2$. But we just showed that $\delta M \ge \Omega_H\,\delta J = \frac{\delta J}{2M}$, that is, $\delta M^2 \ge \delta J$. Try as we may,
we can’t get $J$ to exceed $M^2$.
We have already stated the second law of black hole thermodynamics. This conclusion allows us to formulate
also the first law. Write (40) as

$$\delta M = \frac{\sqrt{M^4 - J^2}}{16\pi M\left(M^2 + \sqrt{M^4 - J^2}\right)}\,\delta A + \Omega_H\,\delta J \tag{44}$$

The coefficient of $\frac{1}{8\pi}\delta A$ in $\delta M$ is known as surface gravity and is denoted by $\kappa$. Thus, $\kappa = \frac{\sqrt{M^4 - J^2}}{2M(M^2 + \sqrt{M^4 - J^2})}$ for
the Kerr black hole (and $\kappa = (4M)^{-1}$ for a Schwarzschild black hole).

If we make the correspondence $E \leftrightarrow M$, $T \leftrightarrow \kappa/2\pi$, and $S \leftrightarrow A/4$ (as in chapter VII.3), then (44) has the same
form as the usual first law of thermodynamics $dE = TdS + dW = TdS - PdV$ relating the change in energy
$dE$ of a system to the work $dW$ done. We could have used (33) to eliminate $\Omega_H$ in (44), but the form as written
serves to show that the angular velocity $\Omega_H$ is dual to the angular momentum $J$ in the same way that pressure
$P$ is dual to the volume $V$. The work done on the black hole by the infalling particle is $\Omega_H\,\delta J$.

In thermodynamics, a process is reversible if $dS = 0$. Here too, a process with $\delta A = 0$ is said to be reversible.
Setting $\delta A = 0$ in (44) gives us a differential equation for $M(J)$, which we can integrate to obtain

$$2M_0^2 = M^2 + \sqrt{M^4 - J^2} \tag{45}$$

with $M_0$ an integration constant. Physically, we can use the Penrose process to decrease both $M$ and $J$, taking
care to ensure that in (43), $\frac{dr}{d\tau} = 0$. In particular, we can start with a Kerr black hole, let $J \to 0$, and end up
with a Schwarzschild black hole with mass $M_0$. Conversely, we can start with a Schwarzschild black hole with
mass $M_0$ and crank up the angular momentum until we get an extremal black hole of mass $\sqrt{2}M_0$.
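
The arithmetic behind the last two sentences can be spelled out in a few lines. Solving (45) for $M$ at fixed $M_0$ gives $M^2 = M_0^2 + J^2/(4M_0^2)$ (a one-line rearrangement of (45), valid for $|J| \le 2M_0^2$). The sketch below, with illustrative function names, checks this and evaluates the extremal case.

```python
import math

def irreducible_mass(M, J):
    """M_0 defined by (45): 2 M_0^2 = M^2 + sqrt(M^4 - J^2)."""
    # Clamp round-off so that an extremal hole (J = M^2) does not give sqrt of a tiny negative.
    return math.sqrt(0.5 * (M**2 + math.sqrt(max(M**4 - J**2, 0.0))))

def mass_at_fixed_area(M0, J):
    """(45) solved for M at given M_0 and J; valid for |J| <= 2 M_0^2."""
    return math.sqrt(M0**2 + J**2 / (4.0 * M0**2))

if __name__ == "__main__":
    M0 = 1.0
    J_extremal = 2.0 * M0**2                  # here J = M^2, the extremal value
    M_extremal = mass_at_fixed_area(M0, J_extremal)
    print(M_extremal, math.sqrt(2.0) * M0)
    # Fraction of the mass of an extremal hole that a reversible process can remove.
    print(1.0 - M0 / M_extremal)
    print(irreducible_mass(mass_at_fixed_area(M0, 1.0), 1.0))
```

Run in reverse, this says that a reversible process can take an extremal black hole of mass $M$ down to a Schwarzschild black hole of mass $M/\sqrt{2}$, so that at most a fraction $1 - 1/\sqrt{2} \simeq 0.29$ of the mass can be extracted.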


Appendix 2: The Weyl approach to the Kerr black hole 


In chapter VI.3 I mentioned in an appendix Weyl’s short-cut derivation of the Schwarzschild solution by
plugging the Schwarzschild metric directly into the Einstein-Hilbert action and varying, a procedure justified
mathematically only decades later. Following Weyl, Deser and Franklin¹¹ proposed plugging $ds^2 = g_{tt}dt^2 +
g_{rr}dr^2 + g_{\theta\theta}d\theta^2 + g_{\varphi\varphi}d\varphi^2 + 2g_{t\varphi}dt\,d\varphi$ into the action and varying. As you may recognize, even with the short-cut,
the situation is enormously more complicated than for the Schwarzschild case: we end up with five coupled
partial differential equations for the five functions $g_{tt}$, $g_{rr}$, $g_{\theta\theta}$, $g_{\varphi\varphi}$, $g_{t\varphi}$ of $r$ and $\theta$. Deser and Franklin were able
to make further progress only by using symmetry and gauge arguments to restrict these five functions.


Appendix 3: Rotating black holes are powerful sources of radiation 


We finally come to our stated goal of understanding why rotating black holes are such powerful sources of
radiation. In chapter VII.1, we calculated the amount of energy radiated by a particle in the accretion disk around
a Schwarzschild black hole as it falls in. Here we will do the analogous calculation for a Kerr black hole.

In chapters V.4, VI.3, and VII.1, the motion of particles, massive or massless, was worked out in Schwarzschild
spacetime. So by now you should be able to work things out for the Kerr spacetime,¹² but as you might suppose,
the computations become considerably more involved. We now have only cylindrical, not spherical, symmetry, so
that for the motion of a particle, only the component of its angular momentum along the direction of rotation is
conserved. Thus, in general, the motion will not be confined to a plane, so that the orbits may be quite complicated.
The exception is for motion entirely within the equatorial plane defined by $\theta = \pi/2$: then angular momentum
conservation guarantees that the particle will stay in the plane.

Confusio looks puzzled for a moment, “But in chapter I.1, and in chapters V.4, VI.3, and VII.1, for both the
Newtonian and the Schwarzschild problems, we also kept the particle in the equatorial plane.”

Yes, but the difference is that in those cases, we could do that with no loss of generality, while here the
equatorial plane is singled out as special.

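So set $\theta = \pi/2$ and proceed as in chapter V.4. Compared to (V.4.11-12), the conservation laws are now
necessarily more complicated due to the nondiagonal term $g_{t\varphi}$ in the metric. Write (4) and (5) as

$$\begin{pmatrix} g_{tt} & g_{t\varphi} \\ g_{\varphi t} & g_{\varphi\varphi} \end{pmatrix}\begin{pmatrix} \dot{t} \\ \dot{\varphi} \end{pmatrix} = \begin{pmatrix} -\epsilon \\ l \end{pmatrix} \tag{46}$$

where I have used the shorthand $\dot{t} = \frac{dt}{d\tau}$ and $\dot{\varphi} = \frac{d\varphi}{d\tau}$. As we have discussed almost ad nauseam since chapter
II.2, together with two conservation laws, we also have what amounts to the definition of proper time, which
after setting $\dot{\theta} = 0$ reads $g_{tt}\dot{t}^2 + 2g_{t\varphi}\dot{t}\dot{\varphi} + g_{\varphi\varphi}\dot{\varphi}^2 + g_{rr}\dot{r}^2 = -1$. Calling the matrix in (46) $G$, we can write the first
three terms in this equation as

$$(\dot{t}, \dot{\varphi})\,G\begin{pmatrix} \dot{t} \\ \dot{\varphi} \end{pmatrix} = (-\epsilon, l)\,G^{-1}GG^{-1}\begin{pmatrix} -\epsilon \\ l \end{pmatrix} = (-\epsilon, l)\,G^{-1}\begin{pmatrix} -\epsilon \\ l \end{pmatrix} = (\det G)^{-1}\left(g_{\varphi\varphi}\epsilon^2 + 2g_{t\varphi}\epsilon l + g_{tt}l^2\right)$$

But we have already evaluated $\det G = g_{tt}g_{\varphi\varphi} - g_{t\varphi}^2$. Proceeding thus, we arrive at

$$\dot{r}^2 - \frac{r_S}{r} + \frac{l^2 + a^2(1 - \epsilon^2)}{r^2} - \frac{r_S(l - a\epsilon)^2}{r^3} - \epsilon^2 + 1 = 0 \tag{47}$$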


Of course, as an immediate check, we can verify that for a = 0, we recover (VII.1.1) for the Schwarzschild black 
hole. Remarkably, even though the Kerr metric is so much more complicated than the Schwarzschild metric, for 
this special equatorial plane case, the effective Newtonian potential still consists of a $1/r$, a $1/r^2$, and a $1/r^3$ term. 
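
If you would rather not redo the algebra by hand, a quick numerical check of (47) is easy to set up. The sketch below assumes the standard equatorial Boyer-Lindquist components $g_{tt} = -(1 - r_s/r)$, $g_{t\varphi} = -r_s a/r$, $g_{\varphi\varphi} = r^2 + a^2 + r_s a^2/r$, and $g_{rr} = r^2/\Delta$ with $\Delta = r^2 - r_s r + a^2$ (quoted from the usual form of the Kerr metric rather than from this passage). The function names are mine. It solves the proper time condition for $\dot r^2$ with the $G^{-1}$ trick and compares the result with $-V$ as read off from (47).

```python
def kerr_equatorial_metric(r, a, rs):
    """Equatorial (theta = pi/2) Boyer-Lindquist components; assumed standard form."""
    delta = r * r - rs * r + a * a
    g_tt = -(1.0 - rs / r)
    g_tphi = -rs * a / r
    g_phiphi = r * r + a * a + rs * a * a / r
    g_rr = r * r / delta
    return g_tt, g_tphi, g_phiphi, g_rr


def rdot_squared_from_metric(r, l, eps, a, rs):
    """(dr/dtau)^2 from g_rr rdot^2 + (first three terms) = -1."""
    g_tt, g_tphi, g_phiphi, g_rr = kerr_equatorial_metric(r, a, rs)
    det_g = g_tt * g_phiphi - g_tphi ** 2
    # (-eps, l) G^{-1} (-eps, l)^T written out explicitly
    first_three = (g_phiphi * eps ** 2 + 2.0 * g_tphi * eps * l + g_tt * l ** 2) / det_g
    return (-1.0 - first_three) / g_rr


def potential(r, l, eps, a, rs):
    """V(r; l, a, rs, eps), defined so that rdot^2 + V = 0 reproduces (47)."""
    return (-rs / r
            + (l * l + a * a * (1.0 - eps * eps)) / r ** 2
            - rs * (l - a * eps) ** 2 / r ** 3
            + 1.0 - eps * eps)


def max_mismatch(samples):
    """Largest |rdot^2 + V| over sample points (r, l, eps, a, rs)."""
    return max(abs(rdot_squared_from_metric(*s) + potential(*s)) for s in samples)


# Arbitrary sample points outside the horizon, in units with rs = 1.
SAMPLES = [(4.0, 2.5, 0.97, 0.3, 1.0), (10.0, -3.0, 1.1, 0.45, 1.0), (2.5, 1.7, 0.9, 0.1, 1.0)]
print(max_mismatch(SAMPLES))  # should be at the level of rounding error
```

The check is only as good as the assumed metric components, but it does confirm that the cross term $g_{t\varphi}$ is what turns $l^2$ into the combination $(l - a\epsilon)^2$.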

Confusio is quick to point out that, strictly speaking, this is no longer a standard Newtonian mechanics 
problem, since the potential also depends on the effective energy $\epsilon^2 - 1$. 

But Confusio, this is no objection at all. We are merely using Newtonian mechanics as a pedagogical aid 
in solving an ordinary differential equation. In fact, let’s write (47) as $\dot r^2 + V(r; l, a, r_s, \epsilon) = 0$ and think of a 
Newtonian particle with zero total energy moving in a potential $V(r)$ that depends on a bunch of parameters $l$, 
$a$, $r_s$, and $\epsilon$. 

From this point on, the physics is conceptually the same as in the Schwarzschild case in chapter VII.1, and I 
urge you to review the steps there. Physically, as before, a particle in the accretion disk, starting from far away, 
crashes through the other particles in the disk and eventually elbows its way into the innermost stable circular 
orbit, heating up the accretion disk and radiating away energy in the process. We thus have to find the radius 
$r_{\rm ISCO}$ of the innermost stable circular orbit and evaluate $V(r_{\rm ISCO})$ to find out what fraction of the rest mass of the 
particle was lost to radiation. 

For clarity, break the calculation into 3 steps. 


1. As in chapter VII.1, we first solve $dV(r; l, a, r_s, \epsilon)/dr = 0$ to determine 
$r_{\min}(l, a, r_s, \epsilon)$ and $r_{\max}(l, a, r_s, \epsilon)$, 
the locations of the minimum and maximum of the potential, respectively. Since, as we already 
observed, $V$ consists of a $1/r$, a $1/r^2$, and a $1/r^3$ term, this step only requires solving a quadratic 
equation. 


2. The term “innermost” in the astrophysicist’s acronym ISCO refers to $r_{\min}(l, a, r_s, \epsilon)$ decreasing until 
the minimum of the potential disappears at a critical value $l = l_c = l_c(a, r_s, \epsilon)$ determined by solving 
$r_{\min}(l, a, r_s, \epsilon) = r_{\max}(l, a, r_s, \epsilon)$. Remarkably, after all the talk about stationary limit surface and frame 
dragging, the astrophysically relevant calculation involves only high school level algebra (but very messy 
algebra!). 


3. The third and final step requires solving $V(r_{\rm ISCO}; l_c, a, r_s, \epsilon) = 0$. Since $r_{\rm ISCO}$ has been determined in 
terms of $(l_c, a, r_s, \epsilon)$, and $l_c$ in terms of $(a, r_s, \epsilon)$, this equation determines the dimensionless number 
$\epsilon(a, r_s)$ in terms of the two lengths $(a, r_s)$; hence $\epsilon$ depends on the ratio $a/r_s$. As explained in chapter 
VII.1, the fraction of energy lost to radiation is then given by $1 - \epsilon(a/r_s)$. 


You are invited to carry out this calculation (perhaps numerically), which I have so kindly outlined for you. 
You will discover that the particle, just as the light ray studied earlier in this chapter, can be corotating or 
counterrotating. Intuition probably tells you that by corotating, you can get in closer to the black hole and hence 
lose more energy. You could make up some nice plots of how the fraction of energy lost $1 - \epsilon(a/r_s)$ varies as $a$ 
varies. 
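
For readers who take up the invitation numerically, here is one possible shortcut, which is not necessarily the route intended above. At the ISCO, $V$, $dV/dr$, and $d^2V/dr^2$ all vanish, so $V = 1 - \epsilon^2 - r_s u + [l^2 + a^2(1 - \epsilon^2)]u^2 - r_s(l - a\epsilon)^2u^3$, regarded as a cubic in $u = 1/r$, has a triple root. Matching coefficients (my own rearrangement, so treat it as an assumption to be checked) gives $\epsilon^2 = 1 - r_s/(3r_{\rm ISCO})$ and $l_c = a\epsilon \pm r_{\rm ISCO}/\sqrt{3}$, with $r_{\rm ISCO}$ a root of $r^2/3 \pm 2a\epsilon r/\sqrt{3} + a^2 - r_s r = 0$, the upper sign being the corotating case. The sketch finds that root by bisection for $0 \le a \le r_s/2$; the function names and bracketing intervals are illustrative choices.

```python
import math


def potential(r, l, eps, a, rs=1.0):
    """V(r; l, a, rs, eps), defined so that rdot^2 + V = 0."""
    return (-rs / r
            + (l * l + a * a * (1.0 - eps * eps)) / r ** 2
            - rs * (l - a * eps) ** 2 / r ** 3
            + 1.0 - eps * eps)


def isco(a, corotating=True, rs=1.0):
    """Return (r_isco, l_c, eps) for spin parameter a with 0 <= a <= rs/2."""
    sign = 1.0 if corotating else -1.0

    def g(r):
        # Triple-root condition with eps eliminated via eps^2 = 1 - rs/(3r).
        eps = math.sqrt(1.0 - rs / (3.0 * r))
        return r * r / 3.0 + sign * 2.0 * a * eps * r / math.sqrt(3.0) + a * a - rs * r

    # Corotating orbits lie between rs/2 and 3rs; counterrotating ones beyond 3rs.
    lo, hi = (0.5 * rs, 3.0 * rs) if corotating else (3.0 * rs, 5.0 * rs)
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    r = 0.5 * (lo + hi)
    eps = math.sqrt(1.0 - rs / (3.0 * r))
    l_c = a * eps + sign * r / math.sqrt(3.0)
    return r, l_c, eps


def energy_lost(a, corotating=True, rs=1.0):
    """Fraction of rest mass radiated away, 1 - eps."""
    return 1.0 - isco(a, corotating, rs)[2]


for a in (0.0, 0.25, 0.5):
    print(a, isco(a), energy_lost(a))
```

For $a = 0$ this gives back $r_{\rm ISCO} = 3r_s$ and an energy loss of about 6%, and as $a \to r_s/2$ the corotating loss approaches $1 - 1/\sqrt{3} \simeq 0.42$. Very close to the extremal value the root becomes nearly degenerate, so the bisection pins down $r_{\rm ISCO}$ less sharply there.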

As for me, I am content to work out the extremal case $a = r_s/2 = M$, for which the expressions involved 
simplify quite a bit. Going through the steps outlined here, we find readily that* the fraction of energy lost is 
given by $1 - \frac{1}{\sqrt{3}} \simeq 0.42$, a whopping 42% compared to the 6% for a Schwarzschild black hole, and the pitiful 
0.7% for the thermonuclear processes that power the stars! 


Exercises 


1 Consider the terms $a(r, \theta)dr^2 + 2c(r, \theta)dr\,d\theta + b(r, \theta)d\theta^2$ in $ds^2$ in (1). Show that by redefining $r = f(\tilde r, \theta)$, 
you can get rid of the cross term. 


2 Evaluate the Kerr metric in the limit $r \to \infty$. 

3 Evaluate the Kerr metric in the limit $a \to 0$. 


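4 Write $g_{\varphi\varphi} = \Sigma^2\sin^2\theta/\rho^2$. Show that $\Sigma^2 = (r^2 + a^2)^2 - \Delta a^2\sin^2\theta$ and that the Kerr metric can be written 
in the form 

$$ds^2 = -\frac{\Delta - a^2\sin^2\theta}{\rho^2}dt^2 - \frac{2r_s a r\sin^2\theta}{\rho^2}dt\,d\varphi + \frac{\rho^2}{\Delta}dr^2 + \rho^2d\theta^2 + \frac{\Sigma^2}{\rho^2}\sin^2\theta\,d\varphi^2 \qquad (48)$$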


5 Show that the Kerr metric can be written in the form (23). 
6 Plot $\omega(r, \theta)$ as a function of $r$ for various values of $\theta$ and $a$. 


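7 Using $r_s$ as the unit of length, show that 

$$r_s\Omega_\pm(\theta = \pi/2) = \left(\alpha \pm x\sqrt{x^2 - x + \alpha^2}\right) \Big/ \left(x^3 + \alpha^2x + \alpha^2\right)$$

with $x = r/r_s$ and $\alpha = a/r_s$. Note that 

$$x_{sl} = r_{sl}(\theta = \pi/2)/r_s = 1 \quad \text{and} \quad x_+ = r_+/r_s = \tfrac{1}{2}\left(1 + \sqrt{1 - 4\alpha^2}\right)$$

Plot $r_s\Omega_\pm(\theta = \pi/2)$. 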


8 Show that in the equatorial plane, light rays follow 

$$\frac{1}{l^2}\left(\frac{dr}{d\lambda}\right)^2 + \frac{1}{r^2}\left(1 - \frac{a^2}{b^2} - \frac{r_s}{r}\left(1 - \mathrm{sign}(l)\frac{a}{b}\right)^2\right) = \frac{1}{b^2} \qquad (49)$$

with as usual the impact parameter $b = |l|/\epsilon$. Compare with the corresponding Schwarzschild potential 
$\frac{1}{r^2}\left(1 - \frac{r_s}{r}\right)$ in (VII.1.12), which we can recover from the potential here by setting $a = 0$. An interesting new 
feature here is the appearance of the sign function $\mathrm{sign}(l)$, reminding us that corotating and counterrotating 
light behave differently. 


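9 For completeness, I write Kerr’s two original forms here: (I) 

$$ds^2 = -\left(1 - \frac{r_s r}{\rho^2}\right)\left(du + a\sin^2\theta\,d\varphi\right)^2 + 2\left(du + a\sin^2\theta\,d\varphi\right)\left(dr + a\sin^2\theta\,d\varphi\right) + \rho^2\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right) \qquad (50)$$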


* Also $l_c = \frac{2}{\sqrt{3}}a$ and $r_{\rm ISCO} = \frac{r_s}{2} = M$. 

with $\rho^2 = r^2 + a^2\cos^2\theta$ as in the text, and (II) 

$$ds^2 = dx^2 + dy^2 + dz^2 - dt^2 + \frac{r_s r^3}{r^4 + a^2z^2}\left(dt + \frac{r(x\,dx + y\,dy)}{r^2 + a^2} + \frac{a(y\,dx - x\,dy)}{r^2 + a^2} + \frac{z\,dz}{r}\right)^2 \qquad (51)$$

with the function $r(x, y, z)$ defined by 

$$\frac{x^2 + y^2}{r^2 + a^2} + \frac{z^2}{r^2} = 1 \qquad (52)$$


Find the coordinate transformations that bring these into the Boyer-Lindquist form (15). 


10 The Kerr-Schild form is obtained by noting that in (50), the dependence on the mass $M$ of the black hole can 
be explicitly split off by writing the metric as $g_{\mu\nu} = \tilde\eta_{\mu\nu} + \frac{r_s r}{\rho^2}l_\mu l_\nu$, with $\tilde\eta_{\mu\nu}$ independent of $M$ and with the 
vector $l_\mu = (1, 0, 0, a\sin^2\theta)$ in the basis $x^\mu = (u, r, \theta, \varphi)$. Show that $l_\mu$ is null (that is, lightlike) with respect 
to both $g_{\mu\nu}$ and $\tilde\eta_{\mu\nu}$. 


11 According to the preceding exercise, the Kerr metric in the limit $M \to 0$ should give the Minkowski metric 
heavily disguised as $\tilde\eta_{\mu\nu}$. Calculate the Riemann curvature tensor for $\tilde\eta_{\mu\nu}$. 

Notes 

1. Frame dragging: Consider a number of related slang expressions in various unrelated languages, for example 
“draguer” in French (originally, to fish with a drag net), and “to drag or pull a girl along” in Cantonese. 

2. A deep and involved song and dance in quantum field theory shows that the negative root, when correctly 
interpreted, leads to the existence of antimatter. See QFT Nut or any other quantum field theory text. 

3. As expressed in Boyer-Lindquist coordinates. Kerr originally wrote the solution in another form. See 
exercise 9. 

4. For an interesting discussion of the history, see D. L. Wiltshire, M. Visser, and S. M. Scott, eds., The Kerr 
Spacetime: Rotating Black Holes in General Relativity, Cambridge University Press, 2009. 

5. For a first-person account of the events leading up to the Kerr solution (and how the young Kerr was allowed 
only 10 minutes at the conference where he presented his solution), see G. Dautcourt, “Race for the Kerr 
Field,” arXiv:0807.3473. According to this author, the construction of the Berlin Wall affected the race. 

6. For a detailed and rather technical analysis, see N. Straumann, General Relativity, pp. 432ff. 

7. In 1968, B. Carter showed how, with various assumptions (such as separability), one could obtain the Kerr 
metric. 

8. There is, however, a formal similarity, with the role of the infalling particle in the Penrose process played by 
a vacuum fluctuation in the Hawking process. 

9. The interested reader can find a drawing in C. W. Misner, K. S. Thorne, and J. A. Wheeler, Gravitation, 
showing an advanced civilization, having erected a spherical metal framework around a Kerr black hole, 
dumping its garbage through the horizon. 

10. Of course, if we were able to get ourselves near a Kerr black hole, we might as well navigate close to any old 
sunlike star. 

11. S. Deser and J. Franklin, arXiv:1002.1066 (2010). 

12. I should mention that for rotating spacetimes the analog of the Newton-Jebsen-Birkhoff theorem does not 
exist. Outside a rotating star or planet that is not a black hole, the spacetime does not have to be the Kerr 
spacetime; it merely has to approach the Kerr spacetime far away. (In chapter IX.4, we will see that the 
spacetime outside a mass distribution could be given as a multipole expansion in terms of the $T^{\mu\nu}$ of the 
mass distribution. Within some constraints, you have the freedom to arrange $T^{\mu\nu}$ and hence modify the 
spacetime outside. Far away, however, the higher multipoles fall away, and the spacetime must approach 
Kerr asymptotically.) 

13. For a useful list of relevant results for the Kerr black hole, see the article by M. Visser in Wiltshire et al. ibid. 

Black holes with electric charge 


Reissner in 1916 and Nordström in 1918 discovered independently a spacetime with the 
same setup as in the Schwarzschild solution, except that the central mass carries an electric 
charge* $Q$. Since it is difficult† to imagine an astrophysical object with a large electric 
charge, the solution is of theoretical, rather than practical, interest. 

As is the case for Schwarzschild spacetime, the metric has the form $ds^2 = -A(r)dt^2 + B(r)dr^2 + r^2d\Omega^2$ 
with $A(r)$ and $B(r)$ to be determined. In addition to Einstein’s equation, 
we now also have to solve Maxwell's equation $D_\mu F^{\mu\nu} = \frac{1}{\sqrt{-g}}\partial_\mu(\sqrt{-g}F^{\mu\nu}) = 0$. Spherical 
symmetry implies that the electric field has only a radial component, which we will 
call $E = F_{0r} = -F_{r0}$, with the identification in terms of the field strength given in chapter 
VI.4. Thus, $F^{r0} = g^{rr}g^{00}F_{r0} = E/(AB)$. Also, $g = -ABr^4\sin^2\theta$. Maxwell’s equation 
thus reduces to $\partial_r(r^2E/\sqrt{AB}) = 0$, with the solution 

$$E = \frac{Q\sqrt{AB}}{r^2} \qquad (1)$$

The electric charge $Q$ is defined by the boundary condition $E(r) \to Q/r^2$ as $r \to \infty$. Since 
the energy density contained in the electric field dies off rapidly, we expect spacetime to 
be asymptotically flat, that is, $A(r) \to 1$, $B(r) \to 1$, as $r \to \infty$. With $F_{0r} = -F_{r0}$ the only 
nonzero components of the field strength $F_{\mu\nu}$, the other Maxwell's equation $\varepsilon^{\mu\nu\lambda\rho}D_\nu F_{\lambda\rho} = 0$ 
is trivially satisfied. 

Next, we have to solve Einstein’s equation, now with a nonvanishing $T_{\mu\nu}$ from the energy 
momentum contained in the electromagnetic field. From chapter VI.4, we have 
$T_{\mu\nu} = F_{\mu\lambda}F_\nu{}^\lambda - \frac{1}{4}g_{\mu\nu}F_{\lambda\rho}F^{\lambda\rho}$. Recall that the energy momentum tensor of the electromagnetic 
field is traceless, so that Einstein’s equation (VI.5.10) reduces to 

$$R_{\mu\nu} = 8\pi G T_{\mu\nu} \qquad (2)$$

where the Ricci tensor $R_{\mu\nu}$ was computed in chapter VI.3. As before, (2) contains three 
equations, for $\mu\nu = 00$, $rr$, and $\theta\theta$, but due to the Bianchi identity, only two of these 
equations are independent and serve to determine the two unknown functions $A$ and $B$. 


* The Kerr solution also can be endowed with an electric charge, in which case it is known as the Kerr-Newman 
solution. 

† Since any such object, if, say, positively charged, would attract electrons from its environment and repel 
protons and so quickly neutralize its charge. 

Let us now compute $T_{00} = g^{rr}F_{0r}F_{0r} - \frac{1}{4}g_{00}(2F_{0r}F_{0r})g^{00}g^{rr} = (E^2/B) + \frac{1}{4}(-2E^2)/B = E^2/(2B)$. 
Similarly, we find $T_{rr} = -E^2/(2A)$ and $T_{\theta\theta} = r^2E^2/(2AB)$. Thus, $BT_{00} + AT_{rr} = 0$, 
and so $BR_{00} + AR_{rr} = 0$. At this point, either we look up the expression for $R_{00}$ and $R_{rr}$ in 
chapter VI.3, or we remember, if we are endowed with a great memory, that $BR_{00} + AR_{rr}$ 
is proportional to the combination $\left(\frac{A'}{A} + \frac{B'}{B}\right)$. Either way, we find $\left(\frac{A'}{A} + \frac{B'}{B}\right) = 0$ with the 
instant solution $AB = 1$, just as in the Schwarzschild solution. 

Things now simplify: in particular, $T_{\theta\theta} = Q^2/(2r^2)$. Putting this into the $\mu\nu = \theta\theta$ equation 
in (2), looking up $R_{\theta\theta}$ in chapter VI.3, and eliminating $B = 1/A$, we obtain $A + rA' = (rA)' = 1 - (4\pi GQ^2/r^2)$ 
with the solution 

$$A(r) = 1 - \frac{2GM}{r} + \frac{4\pi GQ^2}{r^2} \qquad (3)$$


Also, as in the Schwarzschild case, the total mass $M$ appears as an integration constant 
fixed by the boundary condition as $r \to \infty$. 
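
A quick numerical sanity check on (3) may be reassuring. The sketch below (function names and sample values are my own, chosen purely for illustration) compares a finite-difference estimate of $(rA)'$ with $1 - 4\pi GQ^2/r^2$ and confirms that $r(1 - A)/2G$ tends to the integration constant $M$ at large $r$.

```python
import math


def A(r, M, Q, G=1.0):
    """A(r) = 1 - 2GM/r + 4 pi G Q^2 / r^2, as in (3)."""
    return 1.0 - 2.0 * G * M / r + 4.0 * math.pi * G * Q * Q / (r * r)


def d_rA_dr(r, M, Q, G=1.0, h=1e-5):
    """Central-difference estimate of d(rA)/dr."""
    return ((r + h) * A(r + h, M, Q, G) - (r - h) * A(r - h, M, Q, G)) / (2.0 * h)


def residual(r, M, Q, G=1.0):
    """(rA)' minus the source term 1 - 4 pi G Q^2 / r^2; should be nearly zero."""
    return d_rA_dr(r, M, Q, G) - (1.0 - 4.0 * math.pi * G * Q * Q / (r * r))


def asymptotic_mass(r, M, Q, G=1.0):
    """r (1 - A) / (2G), which tends to M as r grows."""
    return r * (1.0 - A(r, M, Q, G)) / (2.0 * G)


for r in (1.5, 3.0, 10.0):
    print(r, residual(r, M=1.0, Q=0.2))
print(asymptotic_mass(1.0e6, M=1.0, Q=0.2))
```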

You might have thought that solving the coupled Einstein and Maxwell equations would 
be rather difficult, but thanks to spherical symmetry, the solution pops out easily and has 
a remarkably simple form.¹ 


Spacetime structure of the charged black hole 


We now regard the Reissner-Nordström spacetime as a charged black hole (rather than a 
charged star). I will merely sketch some salient features of the spacetime, referring you to 
more specialized treatments. Indeed, the rest of this chapter may be omitted upon a first 
reading and is not needed for the rest of this book. For the sake of clarity, let us use units 
in which $G = 1$ and absorb the $4\pi$ in (3) into the definition of $Q$, so that we write 

$$ds^2 = -\left(1 - \frac{2M}{r} + \frac{Q^2}{r^2}\right)dt^2 + \left(1 - \frac{2M}{r} + \frac{Q^2}{r^2}\right)^{-1}dr^2 + r^2d\Omega^2 \qquad (4)$$

Consider the function $A(r) = -g_{00}(r) = \left(1 - \frac{2M}{r} + \frac{Q^2}{r^2}\right) = (r - r_+)(r - r_-)/r^2$ with 

$$r_\pm = M \pm \sqrt{M^2 - Q^2} \qquad (5)$$


Evidently, charged black holes fall into three categories: (a) subextremal, with Q < M; 
(b) extremal, with Q = M; and (c) transextremal or “naked,” with Q > M. The terminology 
will become clear shortly. 
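
A minimal sketch of (5) and the three categories, in the units with $G = 1$ and the $4\pi$ absorbed into $Q$. The function names and the tolerance used to decide “extremal” are illustrative choices of mine.

```python
import math


def metric_function(r, M, Q):
    """A(r) = 1 - 2M/r + Q^2/r^2."""
    return 1.0 - 2.0 * M / r + Q * Q / (r * r)


def horizons(M, Q):
    """Return (r_plus, r_minus) from (5), or None when Q > M and the roots disappear."""
    disc = M * M - Q * Q
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return M + root, M - root


def classify(M, Q, tol=1e-12):
    """Label a charged black hole as subextremal, extremal, or naked."""
    if abs(abs(Q) - M) <= tol * M:
        return "extremal"
    return "subextremal" if abs(Q) < M else "naked"


for Q in (0.0, 0.6, 1.0, 1.2):
    print(Q, classify(1.0, Q), horizons(1.0, Q))
```

Evaluating `metric_function` on a grid of $r$ for each of these values of $Q$ gives the curves described in the subsections that follow.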

Subextremal black hole 


For $Q = 0$, $r_+ = 2M$ and we recover the Schwarzschild black hole, of course. As $r$ decreases 
from infinity, the function $A(r) = 1 - \frac{2M}{r}$ decreases from 1 to $-\infty$, crossing zero at the 
Schwarzschild radius $2M$. But as soon as we crank up $Q$, the $+Q^2/r^2$ electric term in (3) 
takes over for small $r$, arresting the plunging $A(r)$ and pulling it back up to infinity. (Plot 
this!) The curious feature is that while $t$ becomes a spacelike coordinate for $r_- < r < r_+$, 
it goes back to being a timelike coordinate again once we get below $r_-$. Indeed, in sharp 
contrast to the Schwarzschild black hole, the physical singularity at $r = 0$ is timelike. In 
other words, for small $r$, $ds^2 \to -\frac{Q^2}{r^2}dt^2 + \frac{r^2}{Q^2}dr^2 + r^2d\Omega^2$. 

As we crank up $Q$ further, at $Q = M$, the function $A(r)$ just barely touches the $r$-axis and 
the two roots $r_\pm$ merge, with $r_+ = r_- = M$. This is known as an extremal black hole. We 
will come back to it later. For now, let’s ask what happens if we crank up $Q$ even further. 


Naked singularity and cosmic censorship 


For $Q > M$, the roots $r_\pm$ disappear. The function $A(r) = -g_{00}(r)$ does not vanish. It goes 
from 1 at $r = \infty$ to $\infty$ at $r = 0$, staying positive the whole time. Similarly, $1/A(r) = g_{rr}(r)$ 
stays positive. Thus, in contrast to the subextremal black hole (which includes 
the Schwarzschild black hole), $t$ and $r$ are perfectly respectable timelike and spacelike 
coordinates, respectively, with a spacetime described by the Penrose diagram in figure 1. 

Recall that for the Schwarzschild black hole, because of the horizon, signals from the 
vicinity of the physical singularity at $r = 0$ cannot get out to an observer stationed at 
$r = \infty$. Observers outside the horizon cannot see the singularity. General relativists rather 
picturesquely say that the singularity is clothed by the horizon. 

In contrast, here we have what is known as a naked singularity, visible to the outside 
world. Signals from the vicinity of the physical singularity at $r = 0$ can get out to $r = \infty$. 

There was a long history of hand-wringing over the appearance of naked singularities in 
classical general relativity, culminating in the cosmic censorship conjecture. The conjecture 
states that for reasonable initial conditions, a naked singularity cannot form. This 
does not mean that Einstein’s field equation does not allow naked singularities: verily, the 
Reissner-Nordström black hole for $Q > M$ offers an example. But it is an eternal black 
hole just sitting there; the conjecture addresses the issue of whether it could have formed. 
I direct the interested reader to the vast literature devoted to the conjecture. 

You may catch some flavor of the conjecture by noting that, in the Reissner-Nordström 
example, $M$ governs how the metric approaches Minkowskian as $r \to \infty$ and thus by 
definition is the total mass of the black hole. Because the electric force is repulsive, we 
expect the electric field to contribute positively to $M$. In contrast, the gravitational force 
is attractive, so we expect that it will contribute negatively. The transextremal condition 
$Q > M$ says that the black hole is not very massive for its charge. This implies a large 
negative gravitational contribution needed to cancel the positive electric energy. At issue 
is whether the needed negative contribution is physically reasonable. 

[Figure 1: Penrose diagram for a charged black hole, with timelike trajectories marked.] 

Although the conjecture sounds plausible, it has never been proved. The proof of the cos- 
mic censorship conjecture constitutes a difficult mathematical challenge, but regardless, 
the presence of naked singularities indicating a breakdown in our conception of space- 
time is surely a problem for classical general relativity. Opinions differ on whether naked 
singularities pose a problem for theoretical physics. Most people believe that quantum 
gravity would “smooth out” a naked singularity; indeed, even our classical conception of 
spacetime may disappear in whatever theory of quantum gravity we end up with. 

One subject we do not go into in this book consists of the various rigorous singularity 
theorems² telling us in general under what circumstances various types of singularities 
can and cannot occur. These theorems³ are typically proved by assuming that the metric 
satisfies Einstein’s field equation with a physically reasonable* $T^{\mu\nu}$. 


* For example, that it must satisfy the strong energy condition to be discussed in appendix 3 of chapter IX.3. 

Extremal black holes neither attract nor repel 


Let’s return to extremal black holes.⁴ You might have noticed that the extremal condition 
$Q^2 = M^2$ suggests that the gravitational attraction $-M^2/r^2$ and the electric repulsion 
$Q^2/r^2$ between two extremal Reissner-Nordström black holes balance each other. (Recall 
that we have set $G = 1$ and absorbed a factor of $4\pi$.) That there is no net force between 
two extremal black holes indicates that we could put a bunch of these objects down at 
arbitrary locations and they would just sit there. We could not do this with Newtonian 
masses: they would fall toward one another. Nor could we do this with Coulomb charges. 
But with extremal black holes, yes! Newton balances Coulomb. 

This physical intuition, if true, implies an amazing mathematical fact: there must 
exist an entire family of static solutions of the coupled Einstein and Maxwell equations 
describing a bunch of extremal black holes just sitting there. As you will see in appendix 1, 
it is not entirely trivial to find these solutions, but without the physical motivation, the 
solutions would appear to be miraculous.⁵ 

Extremal black holes are in some sense exceptional objects,⁶ poised on the dividing 
line between subextremal black holes and naked abominations, perhaps reminiscent of a 
pencil balanced on its tip at the very edge of a table. 


No-hair theorems 


In my entire scientific life . . . the most shattering experience 
has been the realization that an exact solution of Einstein’s 
equations of general relativity, discovered by . . . Kerr, provides 
the absolutely exact representation of untold numbers of massive 
black holes that populate the universe. 


—S. Chandrasekhar 


When you read the preceding chapter on the Kerr black hole, were you as shattered as 
Chandrasekhar was? What, you are still in one piece? 

Consider two massive stars circling each other. To characterize the system completely, 
we would have to give the mass and size of each of the stars, the chemical composition, 
the temperature, the orbital parameters, and on and on—you get the idea. Eventually, they 
approach each other, and after radiating some electromagnetic and gravitational waves, 
form a rotating black hole. Now only two numbers, the mass M and angular momentum 
J, suffice to characterize the system. 

Amazingly, spacetime, in swallowing up matter and curling over to form a black hole, 
manages to obliterate almost all the evidence, as it were. The Schwarzschild black hole is 
characterized completely and exactly by its mass $M$; the Reissner-Nordström black hole 
by $M$ and $Q$; the Kerr black hole by $M$ and $J$; and the Kerr-Newman black hole by $M$, $J$, 
and $Q$. (We omit mentioning magnetic charge here; see exercise 3.) In practical terms, if 
you were the commander of a spaceship approaching a planet, your second-in-command 
might have to give you a massive computer file listing the location and height of every 
mountain and so forth, but if you were approaching a black hole, only a tiny slip of paper 
with a couple of numbers written on it would suffice. 

So, what is so special about $M$, $J$, and $Q$? The answer should be clear to you: they 
couple to the two infinite-ranged fields we know about, namely the gravitational and the 
electromagnetic field. All the other physical quantities that went into the making of the 
black hole are subsequently hidden behind the horizon. Wheeler has summarized this 
state of affairs by quipping that “Black holes have no hair.” 


Appendix 1: Extremal black holes just sitting there 


Let us now verify the physical argument that there must exist an entire family of static solutions of the coupled 
Einstein and Maxwell equations describing a bunch of extremal black holes “just sitting there.” Start with a single 
extremal black hole. With $Q = M$, the metric in (4) becomes $ds^2 = -\left(1 - \frac{M}{r}\right)^2dt^2 + \left(1 - \frac{M}{r}\right)^{-2}dr^2 + r^2d\Omega^2$. 

Spherical coordinates are dandy for one black hole, but not so good when we have a whole bunch of them. 
So, exploit our freedom to change coordinates and set $r = \rho + M$. A few lines of arithmetic lead us to 
$ds^2 = -f(\rho)^{-2}dt^2 + f(\rho)^2(d\rho^2 + \rho^2d\Omega^2)$, with $f(\rho) = 1 + \frac{M}{\rho}$. We next introduce Cartesian coordinates by setting 
$d\rho^2 + \rho^2d\Omega^2 = dx^2 + dy^2 + dz^2$. 
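
The “few lines of arithmetic” can be spot-checked numerically. The sketch below (helper names are mine) evaluates the coefficients of $dt^2$, of the radial differential, and of $d\Omega^2$ in both coordinate systems, using $dr = d\rho$, and compares them at a few values of $\rho$.

```python
def f(rho, M):
    """f(rho) = 1 + M/rho."""
    return 1.0 + M / rho


def coefficients_in_r(rho, M):
    """(g_tt, g_rr, r^2) of the extremal metric, evaluated at r = rho + M."""
    r = rho + M
    lapse = 1.0 - M / r
    return -lapse ** 2, lapse ** -2, r * r


def coefficients_in_rho(rho, M):
    """(-f^-2, f^2, f^2 rho^2) of the isotropic form."""
    F = f(rho, M)
    return -F ** -2, F ** 2, F ** 2 * rho ** 2


def max_difference(M, rhos):
    """Largest mismatch between the two sets of coefficients over the sample points."""
    return max(abs(x - y)
               for rho in rhos
               for x, y in zip(coefficients_in_r(rho, M), coefficients_in_rho(rho, M)))


print(max_difference(1.0, (0.1, 0.5, 1.0, 3.0, 20.0)))  # should be at rounding level
```

Note that the horizon $r = M$ sits at $\rho = 0$ in the new coordinate.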

Now consider the Ansatz 

$$ds^2 = -f(x, y, z)^{-2}dt^2 + f(x, y, z)^2(dx^2 + dy^2 + dz^2) \qquad (6)$$

with $f(x, y, z)$ some unknown function of $x$, $y$, $z$, not necessarily $f(\rho)$. We are supposed to plug this into 
Einstein’s equation. 

We also need an Ansatz for the electromagnetic field. For a single extremal black hole, the only nonvanishing 
component of $A_\mu$ (the factor of $4\pi$ in the following expression comes from our scaling $Q$ so that extremal means 
$Q = M$) is the time component $A_0 = Q/(\sqrt{4\pi}\,r) = M/(\sqrt{4\pi}(\rho + M)) = (1 - f(\rho)^{-1})/\sqrt{4\pi}$, with $f(\rho) = 1 + \frac{M}{\rho}$. 

Our inspired guess is to set the only nonvanishing component of $A_\mu$ to be 

$$A_0(t, x, y, z) = \frac{1}{\sqrt{4\pi}}\left(1 - f(x, y, z)^{-1}\right) \qquad (7)$$

with the same unknown function $f(x, y, z)$ as in (6). 

A priori, it would seem hopeless that the Ansatz (6) and (7) with a single time independent function $f(x, y, z)$ 
could solve the numerous coupled Einstein and Maxwell equations, but as I said, we are buoyed by our faith in 
our physical picture. Start with $F_{0i} = -\partial_iA_0 \propto (\partial_i f)/f^2$ (where evidently, $\partial_i = \frac{\partial}{\partial x^i}$ and $(x^1, x^2, x^3) = (x, y, z)$). 
Since $g = -f^4$, Maxwell’s equation $\frac{1}{\sqrt{-g}}\partial_\mu(\sqrt{-g}F^{\mu 0}) = \frac{1}{\sqrt{-g}}\partial_i(\sqrt{-g}F^{i0}) = 0$ reduces to $\partial_i(f^2F_{0i}) \propto \nabla^2f = 0$. 
The unknown function $f$ satisfies Laplace’s equation. 

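Next, to solve Einstein’s equation, we first have to compute the energy momentum tensor 

$$T_{\mu\nu} = F_{\mu\lambda}F_\nu{}^\lambda - \frac{1}{4}g_{\mu\nu}F_{\sigma\rho}F^{\sigma\rho} \qquad (8)$$

which I copy here for convenience. (Here and below, the overall factor $1/\sqrt{4\pi}$ in $F_{0i}$ is dropped; restoring it 
multiplies the expressions for $T_{\mu\nu}$ by $1/(4\pi)$, so that $8\pi T_{\mu\nu}$ is twice what is written.) First, note that 
$g_{00} = -1/f^2$, $g_{11} = f^2$, $g^{00} = -f^2$, and $g^{11} = 1/f^2$, so that 
$F^{0i} = g^{00}g^{ii}F_{0i} = (\partial_i f)/f^2$ and $F_{\sigma\rho}F^{\sigma\rho} = -2(\partial_i f)^2/f^4$ (where $(\partial_i f)^2 \equiv \sum_i \partial_i f\,\partial_i f$). Then 
$T_{00} = F_{0i}F_0{}^i - \frac{1}{4}g_{00}F_{\sigma\rho}F^{\sigma\rho} = (\partial_i f)^2/(2f^6)$. Similarly, we have $T_{11} = \{(\partial_i f)^2 - 2(\partial_1 f)^2\}/(2f^2)$ and of course also the 
corresponding expressions for $T_{22}$ and $T_{33}$. Clearly, $T_{0i} = 0$, but $T_{12} = -(\partial_1 f)(\partial_2 f)/f^2$ does not vanish. A simple 
check on the arithmetic is that the trace $T = g^{\mu\nu}T_{\mu\nu}$ vanishes. 

Onward to Einstein’s equation $R_{\mu\nu} = 8\pi T_{\mu\nu}$. After some work, we find $R_{00} = \{(\partial_i f)^2 - f\nabla^2f\}/f^6$, 
$R_{11} = \{(\partial_i f)^2 - 2(\partial_1 f)^2 - f\nabla^2f\}/f^2$, and $R_{12} = -2(\partial_1 f)(\partial_2 f)/f^2$. Plugging in, we find, remarkably enough, that 
Einstein’s equation collapses to $\nabla^2f(x, y, z) = 0$, in agreement with what Maxwell’s equation demands. As I 
explained, it is amazing, but perhaps a bit less amazing, given our physical motivation. 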
Well, we all know how to solve Laplace’s equation in empty 3-dimensional space. The general solution is 

$$f(\vec x) = 1 + \sum_{a=1}^{N}\frac{M_a}{|\vec x - \vec x_a|} \qquad (9)$$

with the additive constant chosen so that $A_0 \to 0$ as $|\vec x| \to \infty$. As promised, we have found a time independent 
solution of the coupled Einstein-Maxwell equations describing $N$ extremal black holes with arbitrary mass 
$M_a = Q_a$ sitting at arbitrary locations $\vec x_a$. They are not moving, because the force between any pair of black 
holes vanishes! Note that (9) implies a highly nontrivial electric field $\vec E$. 
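
To see (9) at work, here is a small sketch that builds $f$ for a few extremal black holes and checks Laplace’s equation by finite differences at a point away from all of them. The masses, positions, probe point, and function names are all made up for illustration.

```python
import math


def f(point, holes):
    """f = 1 + sum_a M_a / |x - x_a|, as in (9); holes is a list of (M_a, x_a)."""
    return 1.0 + sum(mass / math.dist(point, position) for mass, position in holes)


def electric_potential(point, holes):
    """A_0 = (1 - 1/f) / sqrt(4 pi), as in (7)."""
    return (1.0 - 1.0 / f(point, holes)) / math.sqrt(4.0 * math.pi)


def laplacian(func, point, h=1e-3):
    """Second-order central-difference estimate of the flat-space Laplacian."""
    x, y, z = point
    center = func(point)
    total = 0.0
    for dx, dy, dz in ((h, 0.0, 0.0), (0.0, h, 0.0), (0.0, 0.0, h)):
        total += (func((x + dx, y + dy, z + dz))
                  + func((x - dx, y - dy, z - dz))
                  - 2.0 * center)
    return total / (h * h)


# Three extremal black holes with made-up masses and positions.
HOLES = [(1.0, (0.0, 0.0, 0.0)), (0.5, (3.0, 0.0, 0.0)), (2.0, (0.0, -4.0, 1.0))]
PROBE = (1.2, 0.7, -0.4)

print(laplacian(lambda p: f(p, HOLES), PROBE))        # close to zero away from the holes
print(electric_potential((1.0e9, 0.0, 0.0), HOLES))   # tends to zero far away
```

The finite-difference Laplacian is only approximately zero, of course; the residual shrinks as the step `h` is reduced, until rounding error takes over.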

There are no doubt more elegant ways to arrive at the solution, but in an introductory text, I prefer an explicit 
calculation. The discussion provides an example of physical intuition leading to a mathematical result that would 
otherwise be unsuspected. 


Appendix 2: The interior of a subextremal Reissner-Nordström black hole 


I sketch here what happens to an observer falling into a subextremal Reissner-Nordström black hole with 
$Q^2 < M^2$. (Take the observer to be electrically neutral, so that there is no Lorentz force acting on him.) As he 
falls past the horizon at $r_+$, the physics is much the same as experienced by an observer falling past $r_s$ in a 
Schwarzschild black hole. Indeed, for $r \lesssim r_+$ we have $-g_{00}(r) = (r - r_+)(r - r_-)/r^2 \simeq (r - r_+)(r_+ - r_-)/r_+^2$, 
which, as expected, is close to the corresponding Schwarzschild expression $(r - r_+)/r_+$ for $r_+ \gg r_-$. 

The key physics is that $r$ has turned itself into a time coordinate, and the observer is obliged to keep moving 
in the direction of decreasing $r$ toward the physical singularity $r = 0$. But unlike the Schwarzschild case, the 
observer falling into a subextremal Reissner-Nordström black hole is not doomed: once he gets past $r_-$, the $r$ 
coordinate turns back into a space coordinate! In the region $r < r_-$, the observer can move however he wants. 
He could lazily fall in toward $r = 0$, but he could also move in the direction of increasing $r$ by firing his rocket. 
Once he gets past $r_-$, the $r$ coordinate turns into a time coordinate again, but this time, since he was moving in 
the direction of increasing $r$, he is obliged to keep on moving in the same direction. Eventually, he zooms past 
$r_+$ and escapes from the black hole. 

Just as in the Schwarzschild case, where we saw that (t, r) are not good coordinates to use, to make sense 
of the story here, we also have to work and replace (t, r) by more sensible coordinates so as to eventually arrive 
at the analog of the Kruskal extension of the Schwarzschild metric. We won’t go through that here, but have 
shown the result in the form of a diagram (see figure 1), the analog of figure VII.2.7. In the Schwarzschild case, 
the physical regions I and II are extended to regions III and IV. Here the physical regions we started out with 
are repeated indefinitely. As indicated in the figure, our intrepid observer, when he zooms past $r_+$, will actually 
enter a different asymptotically flat spacetime than the one he started out from. 

If you feel that the eternal Schwarzschild black hole with its Kruskal extension is more of a mathematical 
construct than a physical entity, you would feel even more strongly about the subextremal Reissner-Nordström 
black hole with its extension. 


Exercises 


1 Show that a photon moving in a radial direction in a subextremal Reissner-Nordström black hole follows a 
path determined by $\frac{dt}{dr} = \pm\frac{r^2}{(r - r_+)(r - r_-)}$. Integrate this equation. 


2 Show that by defining $d\tilde t = dt + (A(r)^{-1} - 1)dr$ (in complete analogy to what we did in chapter VII.2), we 
can write the Reissner-Nordström metric in the form 

$$ds^2 = -A\,d\tilde t^2 + 2(1 - A)\,d\tilde t\,dr + (2 - A)\,dr^2 + r^2d\Omega^2 \qquad (10)$$


3 Find a solution describing a black hole endowed with a magnetic charge. In fact, you can write a solution 
with both electric and magnetic charges. This merely reflects what is known as electromagnetic duality. 

Notes 

1. Here is a handwaving understanding of the solution. Since $A(r) \to 1 - \frac{2GM}{r}$ as $r \to \infty$, $M$ represents the total 
mass, including the electromagnetic contribution. As $r$ decreases, by Newton’s second superb theorem (see 
chapter I.1), we should subtract off the electromagnetic contribution $\sim 4\pi\int_r^\infty dr\,r^2\,\frac{1}{2}\left(\frac{Q}{r^2}\right)^2 = \frac{2\pi Q^2}{r}$ and define, 
heuristically, an effective mass $M(r) \sim M - \frac{2\pi Q^2}{r}$. We then guess that $A(r) \sim 1 - \frac{2GM(r)}{r} = 1 - \frac{2GM}{r} + \frac{4\pi GQ^2}{r^2}$. 

2. Proved by B. Carter, G. F. R. Ellis, S. Hawking, R. Penrose, and many others. I refer you to more advanced 
monographs by some of these authors. 

3. Just to give you a flavor of this kind of theorem, let me mention that one theorem states that, with the 
strong energy condition, if a metric has a trapped surface from which light rays cannot escape, then either 
a singularity or a closed timelike curve is present. 

4. They have also played an important role in string theory. 

5. This is why I give you the physics first, unlike some other authors. 

6. The interested reader is referred to the literature, which can be traced from E. Poisson and W. Israel, Phys. 
Rev. D 41 (1990), p. 1796, and D. Marolf, arXiv:1005.2999. Poisson and Israel found an instability associated 
with the inner horizon of close-to-extremal but still subextremal black holes. Marolf, referring to this as the 
“dangers of extremes,” suggests that the more remarkable features of the interior spacetime of extremal 
black holes would in fact not survive any quantum fluctuation. 


Recap to Part VII 


Surprisingly, that innocuous gravitational action that leapt out at us is capable of altering 
the causal structure of spacetime. 

The 18th century fantasy of Michell and Laplace is realized by a global alteration of 
spacetime. Even more amazingly, when quantum fluctuations are turned on, a hapless 
member of a fluctuating pair could fall through the horizon, allowing its partner to escape 
to infinity. 

A rotating black hole can drag spacetime around with it and can convert mass into energy 
at an efficiency almost 100 times higher than that of nuclear processes. 

Charged and extremal black holes are fun objects to play around with, but perhaps more 
importantly, the relativistic equation for stellar interiors can also be written down without 
much fuss. 


Part VIII: Introduction to Our Universe 


VIII.1 The Dynamic Universe 


The universe comes to life 


The Newtonian universe offers a rigid space for stuff to move in, but as we already saw 
in chapter VI.2, the Einsteinian universe enjoys a life of its own, bending and curving, 
reacting to the stuff that fills it. Stuff tells spacetime how to curve. The actors act back on 
the stage. 

Back in chapter VI.2 we made a mad dash to cosmology, but at that point, we didn’t 
know enough and were able to fill the universe with only dark energy, presumed to be a 
manifestation of Einstein’s cosmological constant. Later, in chapter VI.4, we learned how 
to obtain, given a matter action, the corresponding $T^{\mu\nu}$. We can now plug our favorite $T^{\mu\nu}$ 
into the right hand side of Einstein’s field equation (VI.5.10) 

$$R^{\mu\nu} = 8\pi GS^{\mu\nu} = 8\pi G\left(T^{\mu\nu} - \frac{1}{2}g^{\mu\nu}T\right) \qquad (1)$$


thus filling up the universe with one ingredient or another, and watch the universe evolve 
as dictated by (1). 
Also, in chapter VI.2, we took the universe to be described by 


ds* = —dt’ + a*(t)((dx!)? + (dx’)? + (dx) 
While spacetime is curved, space itself is flat. Let us now generalize this description to 
ds* = —dt” + a*(1)&;j(%)dx'dx! (2) 


where $\tilde g_{ij}(x)$ denotes a 3-dimensional metric associated with the space we live in. In other 
words, the universe is regarded as a curved space described by $dl^2 = \tilde g_{ij}dx^i dx^j$, stretched 
at any given instant by the function $a(t)$, the scale factor of the universe. For the moment, 
we leave $\tilde g_{ij}$ unspecified. 

It is conventional to normalize the scale factor by setting $a(t_0) = 1$, with $t_0$ the time at 
present. Recall also from chapter V.3 that we can relate $a(t)$ to the redshift $z$ commonly 
used by astronomers by 


$a(t) = \frac{1}{1+z}$ (3) 


(You can verify that the derivation given earlier goes through regardless of $\tilde g_{ij}$.) A 
large redshift $z > 0$ corresponds to a time when the universe was smaller by a factor 
$a(t) < 1$. 


The curvature of the universe 


One goal of cosmology is to find out what the universe is filled with, and as a result, how the 
universe expands. To study cosmic expansion, we have to calculate the Ricci tensor. Resort 
to our usual trick of extracting the Christoffel symbols we need from the geodesic equations 
obtained by varying $\int\left[dt^2 - a^2(t)\tilde g_{ij}(x)dx^i dx^j\right]^{1/2}$, with 
$d\tau^2 = dt^2 - a^2(t)\tilde g_{ij}(x)dx^i dx^j$, as has been explained ad nauseam in this text 
starting in chapter II.2. By now you can probably do this with your eyes half closed. 

For example, varying with respect to $x^k$, we write the resulting Euler-Lagrange equation as 


$\frac{1}{2}a^2(t)\,\partial_k\tilde g_{ij}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau} = \frac{d}{d\tau}\left(a^2(t)\,\tilde g_{kj}\frac{dx^j}{d\tau}\right)$ 
$= a^2(t)\,\tilde g_{kj}\frac{d^2x^j}{d\tau^2} + 2a\dot a\,\tilde g_{kj}\frac{dt}{d\tau}\frac{dx^j}{d\tau} + a^2(t)\,\partial_i\tilde g_{kj}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau}$ (4) 


Multiplying by $\tilde g^{lk}$ and cleaning up, we obtain 


$\frac{d^2x^l}{d\tau^2} + 2\frac{\dot a}{a}\frac{dt}{d\tau}\frac{dx^l}{d\tau} + \tilde\Gamma^l_{ij}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau} = 0$ (5) 


The Christoffel symbol for the spatial metric $\tilde g_{ij}$ 


$\tilde\Gamma^l_{ij} = \frac{1}{2}\tilde g^{lk}\left(\partial_i\tilde g_{jk} + \partial_j\tilde g_{ik} - \partial_k\tilde g_{ij}\right)$ (6) 


emerged rather nicely, but of course from general considerations, we knew that the 
derivatives of $\tilde g_{ij}$ that appeared in (4) had to self-assemble appropriately. Note that the 
derivation of (5) makes clear that $\tilde g^{lk}$ in (6) denotes the inverse of $\tilde g_{ij}$, not the $lk$ component 
of $g^{\mu\nu}$. 
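
For readers who want the "cleaning up" between (4) and (5) spelled out, here is one way to organize the algebra. It is only a sketch: the shorthand $u^i = dx^i/d\tau$, $u^0 = dt/d\tau$ is introduced here for brevity and is not notation used elsewhere in the text.

```latex
% Shorthand (an assumption of this sketch): u^i = dx^i/d\tau, u^0 = dt/d\tau.
% Step 1: move everything in (4) to one side and divide by a^2.
\tilde g_{kj}\frac{du^j}{d\tau} + 2\frac{\dot a}{a}\,\tilde g_{kj}\,u^0 u^j
  + \left(\partial_i \tilde g_{kj} - \tfrac{1}{2}\partial_k \tilde g_{ij}\right) u^i u^j = 0

% Step 2: only the part of \partial_i \tilde g_{kj} symmetric in (i,j)
% survives contraction with the symmetric product u^i u^j.
\partial_i \tilde g_{kj}\, u^i u^j
  = \tfrac{1}{2}\left(\partial_i \tilde g_{kj} + \partial_j \tilde g_{ki}\right) u^i u^j

% Step 3: contract with the inverse spatial metric \tilde g^{lk}.
\frac{du^l}{d\tau} + 2\frac{\dot a}{a}\,u^0 u^l
  + \tfrac{1}{2}\tilde g^{lk}\left(\partial_i \tilde g_{kj} + \partial_j \tilde g_{ki}
  - \partial_k \tilde g_{ij}\right) u^i u^j = 0
```

The combination in the last line is the spatial Christoffel symbol of (6), which is how (5) comes about.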

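Varying with respect to $t$ yields 


$\frac{d^2t}{d\tau^2} + a\dot a\,\tilde g_{ij}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau} = 0$ (7) 


From (5) and (7) we read off the nonvanishing Christoffel symbols 


$\Gamma^0_{ij} = a\dot a\,\tilde g_{ij}, \quad \Gamma^i_{0j} = \Gamma^i_{j0} = \frac{\dot a}{a}\delta^i_j, \quad \Gamma^l_{ij} = \tilde\Gamma^l_{ij}$ (8) 


It is instructive to compare with what we had in chapter VI.2: $\Gamma^0_{ij} = a\dot a\,\delta_{ij}$, 
$\Gamma^i_{0j} = \Gamma^i_{j0} = \frac{\dot a}{a}\delta^i_j$, and $\Gamma^l_{ij} = 0$ for flat space (but curved spacetime!). 
Indeed, given these results, we could have almost guessed (8). 

From (8), a straightforward and not so tedious calculation then gives the following 
nonvanishing components of the Ricci tensor: 


$R_{00} = -\frac{3\ddot a}{a}, \quad R_{ij} = \tilde R_{ij} + (2\dot a^2 + a\ddot a)\tilde g_{ij}$ (9) 


where $\tilde R_{ij}$ is the Ricci tensor for the spatial metric $\tilde g_{ij}$. As in chapter VI.2, we can understand 
the general features of these results, notably that $R_{00}$ cannot depend on $\tilde g_{ij}$. If we set $\tilde g_{ij}$ to 
the flat metric, we should recover the result $R_{ij} = (2\dot a^2 + a\ddot a)\delta_{ij}$ we had before. But if we 
set $a$ to a constant, we must have $R_{ij} = \tilde R_{ij}$. Hence, the result for $R_{ij}$ could also have been 
anticipated. 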


Cosmological principle 


A working assumption of cosmology is that on scales much larger than galaxies, the 
universe is homogeneous and isotropic: it has neither a special location nor a special direction. 
This so-called cosmological principle, a direct intellectual descendant of the Copernican 
principle, has been verified to remarkable accuracy observationally, notably by 
measurements of the cosmic microwave background.* Of course, this “perfect” cosmological 
principle may have to be modified at any moment by new and unexpected observations,† but 
accepting it, we can then fix $\tilde g_{ij}(x)$. 

Intuitively, we readily understand that 3-dimensional spaces without special location 
and direction more or less have to be the Euclidean 3-space $E^3$, the 3-sphere $S^3$, or its 
hyperbolic cousin $H^3$, discussed really way back in chapter I.6. (In chapter IX.6, we will 
make this expectation precise with a full-fledged discussion of isometry and maximally 
symmetric spaces.) Indeed, in chapter V.3, we already explained that our universe could 
be (spatially) closed, flat, or open, described by 


$ds^2 = -dt^2 + a(t)^2\left(\frac{dr^2}{1 - k\frac{r^2}{L^2}} + r^2 d\Omega^2\right)$ (10) 


with the integer $k = 1$, $0$, and $-1$, respectively, known as a Friedmann-Lemaitre-Robertson-Walker 
universe. As was also explained there, we will often be tempted to absorb the length 
scale $L$ into $r$, so that $r$ is then dimensionless and 
$ds^2 = -dt^2 + R(t)^2\left(\frac{dr^2}{1 - kr^2} + r^2 d\Omega^2\right)$, 
with $R(t) = La(t)$. With the convention $a(t_0) = 1$, we have $L = R(t_0) \equiv R_0$. Do not confuse 
$R(t)$, which is evidently a length, with the scalar curvature $R$. In any case, this is becoming 
standard usage. 


* We assert here that a decade or two ago, it was possible to write a text on gravity and include a more or 
less complete discussion of observational cosmology. But by now, considering that we are living in a golden 
age of cosmology, such a discussion would be either annoyingly brief or soon hopelessly out of date. Thus, I 
cannot possibly do justice to observational cosmology and have to refer you to the standard texts on cosmology 
by Dodelson, by Mukhanov, and by Weinberg. This remark applies to all the chapters in part VIII. 

† Periodically, there are disturbing hints¹ that the cosmological principle might fail. 


Plugging the spatial metric $\tilde g_{ij}$ defined by 
$dl^2 = \tilde g_{ij}dx^i dx^j = \frac{dr^2}{1 - kr^2/L^2} + r^2 d\Omega^2$ into the 
formula for the Ricci tensor, we obtain, after a straightforward calculation, 


$\tilde R_{ij} = \frac{2k}{L^2}\tilde g_{ij}$ (11) 


Indeed, we could have anticipated that the Ricci tensor $\tilde R_{ij}$ would turn out to be 
proportional to $\tilde g_{ij}$. What else could it be, for a space with no special direction and location? No 
other symmetric 2-indexed tensor is lurking around. (In chapter IX.6, we will prove that 
this property holds for any maximally symmetric space.) Furthermore, the Ricci tensor 
vanishes when $k = 0$. Dimensional analysis nails down the $1/L^2$. Thus, you could say that 
the calculation you just did (didn’t you?) is merely to get the 2 in (11). 

Thus, in the end, the message is that life is simple: the spatial components of the 
spacetime Ricci tensor (see (9) and (11)) are given by 


$R_{ij} = \left(2\dot a^2 + a\ddot a + \frac{2k}{L^2}\right)\tilde g_{ij} = (2\dot R^2 + R\ddot R + 2k)\,\tilde g_{ij}/L^2$ (12) 


Note the distinction between the spatial components of the spacetime Ricci tensor $R_{ij}$ and 
the spatial Ricci tensor $\tilde R_{ij}$. They are of course not the same. 


Filling the universe with a perfect fluid 


Let’s fill the universe up with a perfect fluid: it deserves no less. 

We first worked out the energy momentum tensor $T^{\mu\nu} = (\rho + P)U^\mu U^\nu + P\eta^{\mu\nu}$ of a 
perfect fluid in flat spacetime back in chapter III.4, and then, for our discussion of stellar 
interiors in chapter VII.4, promoted it, by invoking the equivalence principle, to the form 


$T^{\mu\nu} = (\rho + P)U^\mu U^\nu + Pg^{\mu\nu}$ (13) 


appropriate for curved spacetime. We now use (13) to source the geometry of the universe. 
Let’s write this out for a diagonal metric (such as the one we have here) in the comoving 
frame (in which $U^i$ vanishes). First, the normalization condition $g_{\mu\nu}U^\mu U^\nu = -1$ reduces 
to $g_{00}(U^0)^2 = -1$, so that $U^\mu = (1, \vec 0)/\sqrt{-g_{00}}$. Thus, 
$T^{00} = (-g_{00})^{-1}(\rho + P) + Pg^{00} = -\rho/g_{00}$ and $T^{ij} = Pg^{ij}$. 

In the present context, $g_{00} = -1$ and so $T^{00} = \rho$ and $T^{ij} = \frac{P}{a^2}\tilde g^{ij}$ in the comoving frame. 
Let’s next see how energy momentum conservation 


$D_\mu T^{\mu\nu} = \partial_\mu T^{\mu\nu} + \Gamma^\mu_{\mu\lambda}T^{\lambda\nu} + \Gamma^\nu_{\mu\lambda}T^{\mu\lambda} = 0$ (14) 


works in our expanding universe. 

For $\nu = 0$, using the diagonal character of $T^{\mu\nu}$, the assumed spatial homogeneity of the 
universe, and the list (8) of Christoffel symbols, we obtain 
$D_\mu T^{\mu 0} = \partial_0 T^{00} + \Gamma^\mu_{\mu 0}T^{00} + \Gamma^0_{ij}T^{ij}$. 
Write this out more explicitly. First, look up in (8) that $\Gamma^\mu_{\mu 0} = 3\frac{\dot a}{a}$ and 
$\Gamma^0_{ij} = a\dot a\,\tilde g_{ij}$. Plugging in our result $T^{00} = \rho$ and $T^{ij} = \frac{P}{a^2}\tilde g^{ij}$ in the 
comoving frame, we find $\Gamma^0_{ij}T^{ij} = a\dot a\,\tilde g_{ij}\frac{P}{a^2}\tilde g^{ij} = 3P\frac{\dot a}{a}$ and thus 


$\dot\rho + 3\frac{\dot a}{a}\rho = \frac{1}{a^3}\frac{d}{dt}(\rho a^3) = -\frac{P}{a^3}\frac{d}{dt}a^3$ (15) 


It is instructive to verify an identity we derived in chapter V.6: 
$\Gamma^\mu_{\mu\lambda} = \frac{1}{\sqrt{-g}}\partial_\lambda\sqrt{-g}$, 
where $g = \det g_{\mu\nu}$. Recall from way back in chapter I.5 that $g$ does not transform as a 
scalar, but that $d^4x\sqrt{-g}$ does and so measures volume. Physically, suppose the comoving 
observer marks out a spatial box measuring $\Delta x^1\Delta x^2\Delta x^3$; then the volume of the box is 
actually $\Delta x^1\Delta x^2\Delta x^3\sqrt{-g}$. Here $(-g) = a(t)^6\,\tilde g$, and thus the volume is proportional to 
$a(t)^3\sqrt{\det\tilde g_{ij}}$. The first factor $a(t)^3$ takes into account the expansion of the universe, the 
second the effect of spatial curvature. Now evaluate the $\lambda = 0$ component of the identity: 
$\Gamma^\mu_{\mu 0} = \frac{1}{\sqrt{-g}}\partial_0\sqrt{-g} = \frac{1}{a^3}\partial_0 a^3 = 3\frac{\dot a}{a}$ 
(since $\tilde g$ does not depend on $t$), in agreement with what 
we had in (8). The identity works, of course; the pedagogical point, rather, is that this 
little exercise sheds light on the physical meaning of the term $3\frac{\dot a}{a}\rho$ in the conservation 
law above: the energy density changes partly because the volume seen by the comoving 
observer changes due to the expansion of the universe. 

Now we also understand (15): the energy density also changes partly due to the pressure 
acting on the comoving volume. It is satisfying to see the adiabatic version of the first law 
of thermodynamics $dE = -PdV$ recovered in (15), as already explained in chapter VI.2. 

Next, we look at the $\nu = j$ component of (14). What could it possibly tell us? Go ahead, 
take a guess before reading on. 

Well, once again looking up in (8) the Christoffel symbols we need, we have 
$D_\mu T^{\mu j} = \partial_\mu T^{\mu j} + \Gamma^\mu_{\mu\lambda}T^{\lambda j} + \Gamma^j_{\mu\lambda}T^{\mu\lambda} = \partial_i T^{ij} + \tilde\Gamma^k_{ki}T^{ij} + \tilde\Gamma^j_{ik}T^{ik}$. 
Plugging in $T^{ij} = \frac{P}{a^2}\tilde g^{ij}$, we see that $D_\mu T^{\mu j} = 0$ means that 


$\partial_i\left(\frac{P}{a^2}\tilde g^{ij}\right) + \frac{P}{a^2}\left(\tilde\Gamma^k_{ki}\tilde g^{ij} + \tilde\Gamma^j_{ik}\tilde g^{ik}\right) = \frac{1}{a^2}\left[\tilde g^{ij}\partial_i P + P\left(\partial_i\tilde g^{ij} + \tilde\Gamma^k_{ki}\tilde g^{ij} + \tilde\Gamma^j_{ik}\tilde g^{ik}\right)\right] = 0$ 


Since we are assuming spatial homogeneity, $P$ (and anything else, for that matter) cannot 
depend on $x^i$. Physically, the pressure gradient $\partial_i P$ must vanish in the comoving frame, 
since otherwise some sort of counterflow must occur to cancel out the pressure gradient. 
The three terms in the round parentheses multiplying $P$ collect nicely into the (spatial) 
covariant derivative $\tilde D_i\tilde g^{ij}$ of the spatial metric, which vanishes identically, as we learned 
back in chapter V.6. 

Thus, the correct guess is that $D_\mu T^{\mu j} = 0$ tells us nothing at all: it is identically satisfied. 
Did you pass the test? I know that all we are doing here is verifying the laws of arithmetic, 
but nevertheless, I find it quite satisfying to see all the pieces coming together to form 0 
identically. 


Closed, open, and flat universes 


We are now ready to solve the field equation (1). Using (13), we find $T = -\rho + 3P$ and hence 
$S_{\mu\nu} = (\rho + P)U_\mu U_\nu + \frac{1}{2}(\rho - P)g_{\mu\nu}$, with $S_{00} = \frac{1}{2}(\rho + 3P)$, and 
$S_{ij} = \frac{1}{2}(\rho - P)g_{ij} = \frac{1}{2}(\rho - P)a^2\tilde g_{ij}$. Recalling (9) and (12), we obtain 


$R_{00} = -\frac{3\ddot R}{R} = 4\pi G(\rho + 3P)$ (16) 


and 


$R_{ij} = (2\dot R^2 + R\ddot R + 2k)\,\tilde g_{ij}/L^2 = 4\pi G(\rho - P)a^2\tilde g_{ij}$ (17) 


As might be expected, $\tilde g_{ij}$ cancels out, leaving behind 


$2\dot R^2 + R\ddot R + 2k = 4\pi G R^2(\rho - P)$ (18) 


You now understand perfectly well that, thanks to the Bianchi identity, only one linear 
combination of (16) and (18) is independent. Obviously, we would rather not deal with 
second derivatives if we can help it. Using (16) to eliminate $\ddot R$ in (18), we finally end up 
with a remarkably simple first order differential equation 


$\dot R^2 + k = \frac{8\pi G}{3}\rho R^2$ (19) 


Note that $P$ does not appear in (19). 
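
The elimination of the second derivative takes only two lines. The following is a sketch of that step, using nothing beyond (16) and (18) as stated in the text:

```latex
% From (16), solve for the second-derivative term:
R\ddot R = -\frac{4\pi G}{3}(\rho + 3P)\,R^2

% Substitute into (18) and move the result to the right hand side:
2\dot R^2 + 2k = 4\pi G R^2(\rho - P) + \frac{4\pi G}{3}(\rho + 3P)\,R^2
               = \frac{16\pi G}{3}\,\rho R^2

% The pressure cancels, since -P + \tfrac{1}{3}(3P) = 0. Dividing by 2 gives (19):
\dot R^2 + k = \frac{8\pi G}{3}\,\rho R^2
```

The cancellation of the pressure terms is the reason $P$ drops out of (19).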
To determine cosmic expansion, solve (19) together with the conservation of energy and 
momentum (15), which I reproduce here for convenience: 


$d(\rho R^3) = -P\,dR^3$, or $\frac{d\rho}{\rho + P} = -3\frac{dR}{R}$ (20) 


(Note that $dt$ cancels out.) Of course, we also have to say what the universe is filled with 
by specifying an equation of state $P(\rho)$. Once given this, we can then solve (20) for $\rho$ as a 
function of $R$, which we can then plug into (19) to solve for $R(t)$. 
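
When the equation of state is too messy to integrate by hand, (20) can be integrated numerically. The sketch below is one possible way to do it, not a method taken from the text: it rewrites (20) as $d\rho/d\ln R = -3(\rho + P(\rho))$ and steps it with a classical fourth order Runge-Kutta scheme. The function name `density_vs_radius` and its parameters are illustrative choices.

```python
import math

def density_vs_radius(pressure, rho_start, r_start, r_end, steps=2000):
    """Integrate (20), d(rho)/d(ln R) = -3 (rho + P(rho)), with classical RK4.

    `pressure` is a callable giving P as a function of rho (the equation of state).
    Returns rho at R = r_end, starting from rho_start at R = r_start.
    """
    def slope(rho):
        return -3.0 * (rho + pressure(rho))

    h = (math.log(r_end) - math.log(r_start)) / steps  # step in ln R
    rho = rho_start
    for _ in range(steps):
        k1 = slope(rho)
        k2 = slope(rho + 0.5 * h * k1)
        k3 = slope(rho + 0.5 * h * k2)
        k4 = slope(rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return rho

# Double the size of the universe, starting from rho = 1 at R = 1.
dust = density_vs_radius(lambda rho: 0.0, 1.0, 1.0, 2.0)             # P = 0
radiation = density_vs_radius(lambda rho: rho / 3.0, 1.0, 1.0, 2.0)  # P = rho / 3
print(dust, radiation)
```

For these two equations of state the numbers can be compared with the closed forms $\rho \propto 1/R^3$ and $\rho \propto 1/R^4$; any other `pressure` callable can be dropped in the same way.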


A Newtonian mnemonic 


Remarkably, a pseudo-derivation of the central equation (19) of Einsteinian cosmology can 
be concocted using Newtonian mechanics. In a Newtonian universe filled with a constant 
mass density (never mind that such a universe does not really make sense), consider a 
large sphere of radius R(t) and an infinitesimal unit mass on the surface of the sphere. 
The unit mass has kinetic energy $\frac{1}{2}\dot R^2$ and, by Newton’s superb theorems, potential energy 
$-G(4\pi R^3/3)\rho/R$. By energy conservation, its total energy $\frac{1}{2}\dot R^2 - G(4\pi R^3/3)\rho/R$ should 
be conserved. Calling this constant $-\frac{1}{2}k$, we obtain (19) and even understand where the 
$8\pi/3$ comes from! 

For k = —1, the total energy is positive, indicating that the Newtonian sphere could 
expand indefinitely, roughly corresponding to an open universe. For k = +1 and negative 
total energy, the sphere would ultimately have to yield to gravity and contract. 

I do not take this Newtonian pseudo-derivation seriously but value it as a highly useful 
mnemonic that could also serve to motivate pedagogically the subtle physics contained 
in (19). 


The universe expands according to what it is full of 


To me, it is amazing that cosmic expansion is governed by two simple equations (19) 
and (20). (Of course, this is largely due to the perfect cosmological principle.) In the next 
chapter, we will solve these two equations in detail. To get oriented, let’s solve them here 
in various simple situations. Keep in mind that the dimensionless $a$ and the dimensional 
$R = R_0 a$ are related trivially, and we can easily pass from one to the other. 

First of all, we will see a posteriori that in studying the early universe, we can neglect 
the curvature term $k$ in (19), which thus becomes 


$\dot a^2 \propto \rho a^2$ (21) 


Secondly, as we will see a posteriori, to a good approximation, we can study the universe 
filled with only one kind of stuff at a time. 

Fill the universe with nonrelativistic matter, sometimes referred to as dust. As explained 
back in chapter III.6, the equation of state is simply $P = 0$. Plug this into (20) to obtain 


$\rho \propto \frac{1}{a^3}$  matter (22) 


which, when inserted into (21), gives $a\dot a^2 \propto 1$, which implies that 


$a \propto t^{2/3}$  matter (23) 


As another example, fill the universe with radiation (perhaps more accurately referred 
to as relativistic matter), characterized by $P = \rho/3$, once again as explained back in 
chapter III.6. Plug this into (20) to obtain $d(\rho a^3) + \frac{1}{3}\rho\,da^3 = 0$, giving 


$\rho \propto \frac{1}{a^4}$  radiation (24) 


which, when inserted into (21), gives $a\dot a \propto 1$, which implies that 


$a \propto t^{1/2}$  radiation (25) 


With $\rho$ going like either $\frac{1}{a^3}$ or $\frac{1}{a^4}$, the right hand side of (19) blows up like either 
$\frac{1}{a}$ or $\frac{1}{a^2}$, respectively, as $a \to 0$. Thus, our neglect of the curvature contribution $k$ in the early 
universe is entirely justified. You can solve (19) with the curvature term (see exercises 1-4) 
and verify this claim. In any case, the present observational evidence favors a flat universe, 
as was first described in (V.3.2). 

You can understand both (22) and (24) easily using elementary physics. 

For nonrelativistic matter, think of a bunch of nucleons or atoms sitting in a box of 
linear dimension of order $a$. The energy density $\rho$ is entirely due to the mass density. (The 
kinetic energy is negligible by comparison, hence the pressure is negligible.) As the box 
expands, the energy density $\rho$ decreases like $1/a^3$, merely because the volume of the 
box has increased like $a^3$. Hence (22). 

For relativistic matter or radiation, think of a photon gas characterized by a temperature 
$T$. The properties of a photon gas are derived in detail in textbooks on statistical mechanics, 
but for our purposes, we can simply use dimensional analysis, as we have done already 
in chapter VII.3. There we showed that, in natural units, energy density $\rho \sim T^4$ and the 
entropy $S \sim VT^3$. Since $S$ is conserved as the box expands adiabatically, $T \propto 1/V^{1/3} \sim 1/a$. 
Hence $\rho \sim 1/a^4$, in agreement with (24). This dimensional argument underlines the fact 
that the conclusion $T \propto 1/a$ holds only for a strictly massless particle.² 


[Figure 1: curves $\rho_r(a)$ for radiation, $\rho_m(a)$ for matter, and $\rho_\Lambda(a)$ for the cosmological constant, plotted against $a$.] 

Figure 1 A schematic log-log plot of $\rho$ versus the universe’s scale factor $a$ 
for radiation, matter, and the cosmological constant. As the universe evolves, 
matter eventually dominates over radiation. As the universe evolves further, the 
cosmological constant, which had been insignificant all along, eventually dominates 
over matter. The cosmic coincidence puzzle is: Why now, when we are around? 


The universe dominated 

As we go back into the early universe, $a \to 0$. Since radiation density goes like $\rho_r \propto \frac{1}{a^4}$ 
while matter density goes like $\rho_m \propto \frac{1}{a^3}$, radiation eventually dominates over matter. See 
figure 1. The universe started in a radiation dominated era and expanded into a matter 
dominated era. 

Since the temperature of the radiation T ~ 1/a, as we go back into the early universe, T 
keeps on increasing. As we will see in more detail in chapter VIII.3, in the early universe, 
atoms and molecules were dissociated into nucleons and electrons, and even earlier, 
nucleons in turn were dissociated into quarks and gluons. As T increases, the masses 
of various particles become negligible compared to their kinetic energies of motion, so 
that everybody becomes “radiation,” or more accurately, relativistic matter. Eventually, the 
temperature formally reaches infinity, our equations become singular, and we have reached 
the Big Bang. 
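
To put numbers on how hot the radiation was at a given redshift, one can combine $a = 1/(1+z)$ from (3) with $T \propto 1/a$ and $\rho_r \propto 1/a^4$. The sketch below does just that. The present photon temperature of 2.725 K is an assumed input that does not appear in this chapter, and the function names are illustrative.

```python
T_CMB_TODAY_K = 2.725  # assumed present photon temperature in kelvin (not from the text)

def scale_factor(z):
    """Scale factor a = 1/(1+z) from (3), normalized so that a = 1 today."""
    return 1.0 / (1.0 + z)

def radiation_temperature(z, t_today=T_CMB_TODAY_K):
    """Photon temperature at redshift z, using T proportional to 1/a."""
    return t_today / scale_factor(z)

def radiation_density_ratio(z):
    """rho_r(z) / rho_r(today), using rho_r proportional to 1/a^4 as in (24)."""
    return scale_factor(z) ** -4

for z in (0.0, 1.0, 1100.0):
    print(z, radiation_temperature(z), radiation_density_ratio(z))
```

The scaling holds only while the particles in question are effectively massless, as noted in the text; it should not be pushed through epochs in which species drop out of the relativistic gas without further care.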

Conversely, as the universe expanded, the radiation dominated era eventually gave way 
to the matter dominated era. Thus, as we anticipated, during its evolution, the universe is 
dominated by one kind of stuff or another. 

The preceding analysis indicates that we can cover both matter and radiation with the 
generic equation of state $P = w\rho$, where evidently $w = 0$ for matter and $w = \frac{1}{3}$ for radiation. 
Indeed, this equation of state even covers the cosmological constant, for which $P = -\rho$, 
as was first mentioned in chapter VI.2. Again, plugging this equation of state into (20) 
to obtain $d(\rho a^3) + w\rho\,da^3 = 0$, we find $\rho \propto 1/a^{3(1+w)}$ and hence $a \propto t^{\frac{2}{3(1+w)}}$ by (21), thus 
recovering our previous results as special cases. 
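
The power law can be checked numerically without solving anything by hand. The sketch below works in units where $(8\pi G/3)\rho = a^{-3(1+w)}$ (a normalization chosen here for convenience), computes the time needed to reach a given $a$ from (21) by a midpoint-rule quadrature, and then reads off the exponent from two such times. The function names are illustrative, and the quadrature is only meant for $w > -1/3$, where the time integral converges at $a = 0$.

```python
import math

def cosmic_time(a_end, w, steps=20000):
    """Time to grow from a = 0 to a_end in a flat universe with P = w * rho.

    Uses (21) in units where (8 pi G / 3) rho = a**(-3 (1 + w)), so that
    da/dt = a * sqrt(rho) and t = integral of da / (a * sqrt(rho)).
    Midpoint rule; intended for w > -1/3.
    """
    h = a_end / steps
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) * h                      # midpoint of the i-th slice
        rho = a ** (-3.0 * (1.0 + w))
        total += h / (a * math.sqrt(rho))
    return total

def fitted_exponent(w):
    """Estimate n in a ~ t**n from the times at which a = 0.5 and a = 1."""
    t1, t2 = cosmic_time(0.5, w), cosmic_time(1.0, w)
    return math.log(1.0 / 0.5) / math.log(t2 / t1)

for w in (0.0, 1.0 / 3.0):
    print(w, fitted_exponent(w), 2.0 / (3.0 * (1.0 + w)))
```

The two printed columns should agree for matter and for radiation. For $w \to -1$ the time integral diverges at small $a$, which is the numerical counterpart of the exponential behavior discussed next.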

In particular, for the cosmological constant, $w = -1$, and the universe expands according 
to $a \propto t^\infty$, which is just code for the exponential $e^{Ht}$ behavior we found in chapter 
VI.2. Correspondingly, $\rho_\Lambda \propto a^0$, which is of course just a check, since we defined the 
cosmological constant to be, duh, a constant. An important remark: in the early universe, 
as $a \to 0$, in light of (22) and (24), $\rho_\Lambda$ becomes negligible compared to $\rho_m$ and $\rho_r$. 


Critical density 


In the previous section, to get oriented in cosmology and in the cosmos, we solved the 
cosmological equation $\dot R^2 + k = \frac{8\pi G}{3}\rho R^2$ for the flat $k = 0$ universe. But it is not difficult 
to see qualitatively what goes on in a closed universe 


$\dot R^2 + 1 = \frac{8\pi G}{3}\rho R^2$  closed universe (26) 


or an open universe 


$\dot R^2 - 1 = \frac{8\pi G}{3}\rho R^2$  open universe (27) 


Once you reach a qualitative understanding, a quantitative understanding is merely a 
matter of showing off your ability to solve a first order ordinary differential equation. 
Define the critical density 


$\rho_c \equiv \frac{3\dot R^2}{8\pi G R^2}$ (28) 


Note that $\rho_c(t)$, which evidently is always nonnegative, depends on time in general. Beware: 
by the term “critical density,” some people mean exclusively its present value $\rho_c(t_0)$. 
Next, divide (26) and (27) by $\dot R^2$ to obtain 


$\frac{\rho}{\rho_c} = 1 + \frac{1}{\dot R^2}$  closed universe (29) 


for a closed universe and 


$\frac{\rho}{\rho_c} = 1 - \frac{1}{\dot R^2}$  open universe (30) 


for an open universe. Thus, to close the universe, $\rho$ must be greater than the critical density 
$\rho_c$. That sure makes sense: you need lots of stuff to curl space around to make it close upon 
itself. In contrast, for $\rho < \rho_c$, the universe is open. 
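
To get a feeling for the size of the critical density (28), here is a small sketch that evaluates it in SI units as a mass density and classifies a universe as closed, flat, or open by comparing $\rho$ with $\rho_c$. The shorthand $H = \dot R/R$, the expansion rate of 70 km/s/Mpc, and the function names are assumptions made for this illustration, not values or notation taken from this chapter.

```python
import math

G_NEWTON = 6.674e-11        # Newton's constant in m^3 kg^-1 s^-2
METERS_PER_MPC = 3.0857e22  # one megaparsec in meters

def critical_density(hubble_km_s_mpc):
    """Critical density 3 H^2 / (8 pi G) in kg/m^3, with H = Rdot / R as in (28)."""
    hubble_si = hubble_km_s_mpc * 1.0e3 / METERS_PER_MPC  # convert to 1/s
    return 3.0 * hubble_si ** 2 / (8.0 * math.pi * G_NEWTON)

def geometry(rho, rho_c):
    """Classify the universe by comparing rho with rho_c, per (29) and (30)."""
    if rho > rho_c:
        return "closed"
    if rho < rho_c:
        return "open"
    return "flat"

rho_c_today = critical_density(70.0)  # 70 km/s/Mpc is an assumed illustrative value
print(rho_c_today, geometry(1.2 * rho_c_today, rho_c_today))
```

With the assumed expansion rate, the result is of order $10^{-26}$ kg/m³, that is, a few hydrogen atoms per cubic meter.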


Consider a universe filled with only matter. Then from (22) $\rho = \rho_0 R_0^3/R^3$, where $\rho_0$ 
evidently denotes the present density. For a closed universe, 
$\dot R^2 = \frac{8\pi G\rho_0 R_0^3}{3R} - 1$, and we see 
that as the universe expands and $R$ increases, the right hand side will eventually decrease 
to 0, so that $\dot R = 0$. The universe stops expanding and starts to contract, as described by 
the negative root in (26): $\dot R = -\sqrt{\frac{8\pi G\rho_0 R_0^3}{3R} - 1}$. 

In contrast, a matter filled open universe, obeying $\dot R = \sqrt{\frac{8\pi G\rho_0 R_0^3}{3R} + 1}$, will expand 
forever, eventually reaching $\dot R \simeq 1$, with a curvature driven expansion, even as the matter 
density dilutes to practically nothing. 
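
The rise and fall of the closed, matter filled universe can be made concrete with the standard cycloid parametrization, which is quoted here without derivation and is not worked out in the text. Writing $C = 8\pi G\rho_0 R_0^3/3$, the sketch below takes $R = \frac{C}{2}(1 - \cos\eta)$ and $t = \frac{C}{2}(\eta - \sin\eta)$ and checks numerically that $\dot R^2 + 1 = C/R$. The parameter $\eta$ and the function names are illustrative.

```python
import math

def closed_matter_universe(eta, c=1.0):
    """Cycloid solution of Rdot^2 + 1 = c / R for a closed, matter filled universe.

    c stands for 8 pi G rho_0 R_0^3 / 3; eta runs from 0 (Big Bang) to 2 pi (recollapse).
    Returns (t, R, Rdot).
    """
    t = 0.5 * c * (eta - math.sin(eta))
    radius = 0.5 * c * (1.0 - math.cos(eta))
    r_dot = math.sin(eta) / (1.0 - math.cos(eta))  # (dR/d eta) / (dt/d eta)
    return t, radius, r_dot

def residual(eta, c=1.0):
    """Left side minus right side of the closed universe equation; should be ~0."""
    _, radius, r_dot = closed_matter_universe(eta, c)
    return r_dot ** 2 + 1.0 - c / radius

for eta in (0.5, math.pi / 2, math.pi, 5.0):
    t, radius, r_dot = closed_matter_universe(eta)
    print(round(t, 4), round(radius, 4), round(r_dot, 4), residual(eta))
```

At $\eta = \pi$ the radius reaches its maximum $R = C$ with $\dot R = 0$; for larger $\eta$ the printed $\dot R$ is negative, corresponding to the negative root mentioned in the text.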

Next, consider a universe that is empty except for the cosmological constant. Of the 
three kinds of stuff to fill the universe considered here, only the energy density $\rho_\Lambda = \Lambda$ can be 
negative, in contrast to matter density $\rho_m$ and radiation density $\rho_r$. 

From (29), we see that in a closed universe, $\rho_\Lambda > \rho_c$ and so a fortiori cannot be negative. 
But (30) tells us that in an open universe $\rho_\Lambda$ can be either positive, in which case $\rho_\Lambda < \rho_c$, 
or negative with no restriction. From (27), we have $\dot R^2 = 1 + \frac{8\pi G}{3}\Lambda R^2$. We see that if $\Lambda > 0$, 
the universe will expand forever, while if $\Lambda < 0$, the universe expands until $\dot R$ reaches 0 
and then starts to contract. As has already been mentioned, the evidence at present points 
to a positive $\Lambda$. 


Big Bang: From no space to space 


We have avoided dealing with the time-time component (16) of Einstein’s field equation 
$-\frac{3\ddot R}{R} = 4\pi G(\rho + 3P)$, seeing that it involves a second derivative. However, it does convey 
an important message: as long as $\rho + 3P > 0$, the acceleration $\ddot R < 0$, so that $\dot R$ always 
decreases, regardless of whether the universe is closed, flat, or open. Hence the curve 
$R(t)$ is convex downward. At present, $\dot R > 0$, since we see redshifts. Extrapolating the 
curve backward, we have thus proved that $R(t) = R_0 a(t)$ must vanish at some point in the 
past. (See figure 2.) The metric in (2) $d\tau^2 = -ds^2 = dt^2 - a^2(t)\tilde g_{ij}(x)dx^i dx^j$ degenerates 
to $d\tau^2 = dt^2$. No space! This spacetime singularity at which space disappears is known as 
the Big Bang.* 

As long as there is a component of $\rho$ that increases faster than $1/R^2$ as $R \to 0$, we 
can use (19) to reach the same conclusion we just proved. As we go back in time into 
the early universe, the right hand side blows up, the curvature term in (19) becomes 
irrelevant, $\dot R \to \infty$, and so $R(t)$ eventually must vanish. This argument also indicates 
that a universe containing only the cosmological constant could evade this argument and 
avoid having a Big Bang in its past (as we have seen in chapter VI.2, and as we will see in 
chapter IX.10). 

As authors of popular physics books on the universe know, the most common mis- 
conception of the proverbial man in the street regarding the Big Bang is that it describes 
some kind of terrific primordial explosion, spewing matter every which way into space. 
You of course know better. The Big Bang is actually the creation of space: from no 
space to space, stretched by the factor a(t) ever since. 


* Originally a derogatory term used by Fred Hoyle to champion the steady state theory of the universe. 


[Figure 2: the scale factor $a(t)$ plotted against $t$, with the Big Bang marked where the curve reaches $a = 0$.] 

Figure 2 If $\rho + 3P > 0$, the scale factor of the universe is concave downward, 
and thus must vanish at some point in the past. 


Before the discovery of dark energy, it was thought that $\rho + 3P > 0$ imposes a rather 
weak condition that surely the content of the universe must satisfy. Certainly, both 
relativistic matter $P = \rho/3$ and nonrelativistic matter $P = 0$ satisfy this condition with room to 
spare. Well, the dark energy, with $P = -\rho = -\Lambda$, is able to violate precisely this condition. 
Indeed, the simple universe discussed in chapter VI.2 with $a(t) = e^{Ht}$ does not have a Big 
Bang in its past. 
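
The sign of $\rho + 3P$ is easy to tabulate for the three kinds of stuff considered in this chapter. The sketch below evaluates $\ddot R/R = -\frac{4\pi G}{3}(\rho + 3P)$ from (16) with $P = w\rho$; the function name and the choice of units with $G = 1$ are illustrative.

```python
import math

def acceleration_over_radius(rho, w, newton_g=1.0):
    """Rddot / R = -(4 pi G / 3) (rho + 3 P) from (16), with P = w * rho."""
    pressure = w * rho
    return -(4.0 * math.pi * newton_g / 3.0) * (rho + 3.0 * pressure)

contents = {"matter": 0.0, "radiation": 1.0 / 3.0, "dark energy": -1.0}
for name, w in contents.items():
    value = acceleration_over_radius(1.0, w)
    print(name, "decelerates" if value < 0 else "accelerates")
```

The borderline case is $w = -1/3$, for which $\rho + 3P$ vanishes and the expansion neither accelerates nor decelerates.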

In the next chapter we will study the cosmological equation in more detail. 


Coincidence problem 


With $\rho_r \propto 1/a^4$, $\rho_m \propto 1/a^3$, and $\rho_\Lambda \propto a^0$, the universe passes through three epochs: a 
radiation dominated epoch early on, followed by a matter dominated epoch, which will 
eventually give way to a dark energy dominated era. This brief history of the cosmos 
immediately poses a coincidence puzzle. In the vast sweep of cosmic time, the period during 
which $\rho_m$ is comparable to $\rho_\Lambda$, as is the situation now, represents but a blink. Is it a 
coincidence that we happen to inhabit the universe just as the two curves $\rho_m(a)$ and $\rho_\Lambda(a)$ 
are crossing each other? Or is there a deeper reason? Perhaps more likely, in my humble 
opinion, our understanding of cosmology is simply incomplete. 
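
How close to "now" the crossing lies can be estimated from the scalings alone. Since $\rho_m/\rho_\Lambda \propto a^{-3}$, the two curves cross at $a = (\rho_m/\rho_\Lambda)_0^{1/3}$, with the ratio evaluated today. The sketch below uses a present ratio of 0.3/0.7, which is an assumed illustrative input and not a value quoted in this chapter.

```python
RATIO_TODAY = 0.3 / 0.7  # assumed present value of rho_m / rho_Lambda (illustrative)

def matter_to_lambda_ratio(a, ratio_today=RATIO_TODAY):
    """rho_m / rho_Lambda at scale factor a, using rho_m ~ 1/a^3 and rho_Lambda ~ a^0."""
    return ratio_today * a ** -3

def crossing_scale_factor(ratio_today=RATIO_TODAY):
    """Scale factor at which the matter and cosmological constant curves cross."""
    return ratio_today ** (1.0 / 3.0)

a_cross = crossing_scale_factor()
z_cross = 1.0 / a_cross - 1.0  # corresponding redshift, from a = 1/(1+z)
print(a_cross, z_cross, matter_to_lambda_ratio(a_cross))
```

With this assumed input the crossing sits at a scale factor of about three quarters of its present value, which is one way of quantifying the puzzle.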


An apparent paradox: More stuff makes the universe expand faster 


Readers of popular physics books, and quite a few beginning students as well, are often 
puzzled by Einstein’s cosmological equation $\dot R^2 + k = \frac{8\pi G}{3}\rho R^2$: it says that more stuff (a 
larger $\rho$) would make the universe expand faster (a larger $\dot R$). You might have thought that 
the gravitational attraction exerted by a larger $\rho$ would hold everybody back, thus slowing 
down the expansion rather than speeding it up. What gives? 

I already alluded to the resolution of this apparent paradox. A particularly illuminating 
discussion invokes time reversal invariance. Einstein gravity is time reversal invariant, 
that is, the Einstein-Hilbert action is unchanged upon $t \to -t$. If we take a movie of 
the expanding universe and run it backward, the plot of the backward-running movie, 
namely the story of a contracting universe, must also be allowed by the laws of physics. 
Mathematically, this is implied by the appearance of $\dot R^2$ in (19) and hence the two solutions 
$\dot R = \pm\sqrt{\frac{8\pi G}{3}\rho R^2 - k}$. Thus, if your intuition tells you that more stuff should speed up the 
contraction, then it also tells you that more stuff also speeds up the expansion. 

The common confusion is basically between velocity and acceleration,⁴ between $\dot R$ 
and $\ddot R$. A somewhat less illuminating resolution of the apparent paradox is to invoke 
the acceleration equation (16) we eliminated, namely $\frac{3\ddot R}{R} = -4\pi G(\rho + 3P)$. If matter 
is normal,⁵ that is, $P > 0$ and increases with increasing $\rho$, then more stuff (large $\rho$) 
decelerates the universe more. 


Einstein should do penance 


In a recent historical study, Nüssbaumer and Bieri recounted the early history of the 
universe at considerable variance from the cartoon history given in many popular accounts. 
They made clear that Lemaitre deserved much more credit than he had traditionally 
received, and others less. They concluded their book by imagining, amusingly, a dinner⁶ 
gathering Einstein, de Sitter, Lemaitre, Eddington, and Hubble. Lemaitre emerged as a 
triple winner, for his* expanding universe,⁷ for his seminal idea on what developed into 
the Big Bang, and for associating the cosmological constant with the vacuum energy. 
While the party toasted the tragically departed Friedmann,⁸ Einstein should, according 
to Nüssbaumer and Bieri, “do penance.”⁹ 

A small story from this laudably balanced history is illuminating. After his friends 
Eddington and de Sitter had both converted to the expanding universe, Einstein changed 
his opinion also. In 1932, Einstein and de Sitter were both visiting the California Institute 
of Technology, and they coauthored a paper that by all accounts would not have passed 
the refereeing system¹⁰ had it not been authored by two big names. Nothing they said 
had not already been said earlier by Friedmann, Lemaitre, and Robertson. Eddington later 
wrote:¹¹ “Einstein came to stay with me shortly afterwards, and I took him to task about it. 
He replied, ‘I did not think the paper very important myself, but de Sitter was keen on it.’ 
Just after Einstein had gone, de Sitter wrote to me announcing a visit. He added: ‘You will 
have seen the paper by Einstein and myself. I do not myself consider the result of much 
importance, but Einstein seemed to think that it was.’” 


* It is worth quoting from a letter from M. Way and H. Nüssbaumer to Physics Today, August 2011, p. 8. “It 
is widely held that in 1929 Edwin Hubble discovered the expanding universe and that his discovery was based 
on his extended observations of redshifts in spiral nebulae. Both statements are incorrect. . . . There is a great 
irony in these falsehoods still being promoted today. Hubble himself never came out in favor of an expanding 
universe; on the contrary, he doubted it to the end of his days.” I am among those who oppose the continual 
promotion of falsehoods in physics. 


Exercises 


1  Solve the cosmological equation $\dot R^2 + 1 = \frac{8\pi G}{3}\rho R^2$ (19) for a radiation dominated closed universe. Plot your 
result. Verify in exercises 1-4 that the curvature is negligible in the early universe. 


2  Solve the cosmological equation for a radiation dominated open universe. Plot your result. 


3  Solve the cosmological equation for a matter dominated closed universe. Plot your result. 


4  Solve the cosmological equation for a matter dominated open universe. Plot your result. 


5  For a universe containing only a cosmological constant, show that $R(t) = H^{-1}\cosh Ht$ if the universe is 
closed, and $R(t) = H^{-1}\sinh Ht$ if the universe is open, with $H^2 = 8\pi G\Lambda/3$. Note that in the closed case, 
the universe does not have a Big Bang; that is, $R(t)$ never does vanish. Recall also the result $R(t) = R_0 e^{Ht}$ 
for a flat universe. 


Notes 


1. See for example, E. D. Kovetz, A. Ben-David, and N. Itzhaki, “Giant Rings in the CMB Sky,” Astrophys. J. 724 
(2010), pp. 374-378. 

2. Consider a box of neutrinos, which are known to have a very small mass $m$. As the box expands, elementary 
quantum mechanics shows that the momentum $p \propto 1/a$. When $p$ drops below $m$, the neutrinos become 
nonrelativistic, with an average kinetic energy of $p^2/(2m) \propto 1/a^2$. The temperature of the neutrino gas, 
defined to be the average kinetic energy, would then drop like $T \propto 1/a^2$. 

3. Another useful way of plotting the behavior of the universe in different eras, which I learned from J. Bjorken, 
is suggested by (21) and the various dependences of $\rho$ on $a$. Plot $\log\dot a$ versus $\log a$, that is, do a log-log 
plot of $\dot a$ versus $a$. For dark energy (the cosmological constant), $\log\dot a = \log a$ + constant; for radiation, 
$\log\dot a = -\log a$ + constant; for matter, $\log\dot a = -\frac{1}{2}\log a$ + constant. 

4. When I was a freshman, it was announced that John Wheeler would give an experimental (in the sense 
of pedagogy rather than physics) course to a handpicked group of beginning students. Wheeler asked the 
assembled students a series of questions to separate the goats from the elect, so to speak. I still remember 
the question that eliminated the largest number of hopefuls. Does a tossed ball have zero acceleration at the 
top of its flight? 

5. Or, somewhat less restrictively, assume the equation of state $P = w\rho$ and $(1 + 3w) > 0$. 

6. H. Nüssbaumer and L. Bieri, Discovering the Expanding Universe, p. 187. 

7. Indeed, there is some shady business for a budding historian of physics to look into and clarify. Since 
Lemaitre’s seminal 1927 paper was published in French in an obscure Belgian journal, Eddington arranged 
for it to be republished in English in 1931. But the two crucial pages containing Lemaitre’s estimate of the 
so-called Hubble constant were omitted in the English translation. Smells rather fishy to say the least. Some 
reader should track down the person responsible for this omission. 

By the time I was finishing this book, a couple of years after I wrote the words above, I learned that 
M. Livio (Nature 479 (2011), p. 171, http://www.nature.com/nature/journal/v479/n7372/full/479171a.html) 
had indeed tracked down the relevant documents and concluded that it was Lemaitre himself who deleted 
the crucial pages. One of Livio’s conclusions was disputed by S. van der Bergh in a letter to the editor (Nature 
480 (2011), p. 321). 

Based on what I read while writing this book and also my earlier popular book (An Old Man's Toy), I feel that 
the kindest thing I can say about Hubble is that he went out of his way not to acknowledge the contributions 
of his contemporaries. I hope that Hubble’s status in cosmology will be reevaluated in the future. 


. For the story of how Aleksandr Friedmann died at the young age of 37, see Toy/Universe p. 85.
. H. Nussbaumer and L. Bieri, Discovering the Expanding Universe, p. 187.
10. Ibid., p. 148.
11. Ibid., p. 128.
VIII.2 Cosmic Struggle between Dark Matter and Dark Energy


A cosmic diagram 


The goal of this chapter is to derive the diagram shown in figure 1, which is sort of a 
phase diagram describing the overall history of the universe according to what it contains. 
The two axes are labeled by $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$, two parameters that we will define in this
chapter and that measure how much matter and how much dark energy, respectively, 
the universe contains at present. Here the term “matter” includes both dark and baryonic 
matter, with the bulk (more than 80%) of it in dark matter, as we will see. As in chapter VI.2 
and the preceding chapter, dark energy is presumed to be the manifestation of Einstein’s 
cosmological constant. Thus, to a first approximation, you can think of the diagram as 
describing the struggle between dark matter and dark energy. 

In this highly simplified picture, the universe is specified by $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$. Notice that
this cosmic diagram contains two straight lines and two curved lines, dividing different 
types of cosmic behavior. 

Above the straight line $\Omega_{\Lambda,0} = \frac{1}{2}\Omega_{m,0}$, cosmic expansion accelerates: the universe will
expand faster and faster. Dark energy overwhelms dark matter. Below this line, cosmic 
expansion decelerates. 

Next, look at the line defined by $\Omega_{m,0} + \Omega_{\Lambda,0} = 1$. The universe is spatially closed above
this line and open below it. A universe sitting right on this line is spatially flat. As explained 
in the preceding chapters, in Einstein gravity, stuff curves space, so space can curl up on 
itself. Lots of stuff in the universe closes it, while not enough stuff leaves it open. 

The Big Bang is defined as the singularity in spacetime when the scale factor of the 
universe a vanishes (as described in the preceding chapter). In figure 1, a curved line 
starts from the point $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0, 1)$. Below this line, the Big Bang banged. Above
this line, no Bang. 

Indeed, back in chapter VI.2, we studied the universe described by $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0, 1)$, a flat universe with no matter, only a cosmological constant. We found there and in


Figure 1 A cosmic diagram describing the overall history of the universe according to how 
much matter and how much dark energy the universe contains at present. 


chapter VI.5 that $a(t) = e^{Ht}$ grows exponentially with the constant $H = \sqrt{8\pi G\Lambda/3}$. Thus, $a$ never does vanish, and there was no Big Bang. In other words, the point $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0, 1)$ belongs to the no Big Bang phase. The curved line tells us how much matter has to
be put in to produce a Big Bang. 

You may have noticed another curve starting from the point $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (1, 0)$ and barely curving upward. Now consider a curve consisting of two pieces joined together, namely the curve just described and a straight line segment consisting of the portion of the $\Omega_{m,0}$ axis between the point $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (1, 0)$ and the origin $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0, 0)$.
This composite curve defines the boundary between two phases. Above this line, the 
universe will expand forever. Below it, the universe will eventually stop its expansion and 
contract. 

Study this figure and decide if it makes sense to you. Observational evidence suggests 
that our universe lies inside the circle around $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0.3, 0.7)$. Thus, according
to the cosmic diagram, our universe is flat and accelerating, with a Big Bang in its past 
and never-ending expansion in its future. 


The cosmological equation 


The derivation of the cosmic diagram starts with the cosmological equation

$$\dot R^2 + k = \frac{8\pi G}{3}\rho R^2 \tag{1}$$

given in the preceding chapter. Define the Hubble* parameter

$$H \equiv \frac{\dot R}{R} = \frac{\dot a}{a} \tag{2}$$

Notice that we said Hubble parameter, not Hubble constant. Unless $a(t)$ is a pure exponential (as in chapter VI.2), $H$ will vary with time. In general, we will indicate the present value of a variable by the subscript 0. Thus, $H_0$ is the present value of the Hubble parameter. Using (2), we rewrite (1) as

$$H^2 = \frac{8\pi G}{3}\rho - \frac{k}{R^2} \tag{3}$$


What time is it over there? 


For the cosmic clock, we have several choices.¹ We could use the time from the Big Bang $t$, or equivalently, the scale factor $a(t)$. Physically, a better choice is the ambient temperature $T$ of the universe during the event under discussion (for example, radiation decoupling, which occurred at $T \sim 0.3$ eV, as will be explained in the following chapter). But observational cosmologists quite naturally use the redshift $z$. In chapter V.3 we derived for light emitted at time $t_e$ the relation

$$1 + z = \frac{1}{a(t_e)} \tag{4}$$

where we have set $a(t_0) = 1$ by convention.

Also, in chapter V.3, in defining proper distance, we encountered the integral $R = \int_{t_e}^{t_0} \frac{dt}{a(t)}$, which we can now convert to an integral over redshift as used by cosmologists:

$$R = \int_{a(t_e)}^{1} \frac{da}{a\dot a} = \int_{a(t_e)}^{1} \frac{da}{a^2 H} = \int_0^z \frac{dz'}{H(z')} \tag{5}$$


Filling up the universe 


The energy density $\rho = \sum_j \rho_j$ may consist of several components. For example, the index $j$ can take on several values: $j = r$, $m$, and $\Lambda$, indicating radiation, matter, and cosmological constant, respectively. For the purpose of this chapter, baryonic matter, luminous or not, is lumped in with dark matter and collectively referred to as matter. As already mentioned, we assume that the dark energy is a manifestation of the cosmological constant. It is sometimes convenient to lump the curvature term in (3) into the energy density by defining $\rho_k \equiv -\frac{3k}{8\pi G R^2}$. With these definitions, we may write (3) as

$$H^2 = \frac{8\pi G}{3}\sum_j \rho_j - \frac{k}{R^2} = \frac{8\pi G}{3}\sum_n \rho_n \tag{6}$$

* Regarding Hubble’s discovery of the expanding universe, see the footnote on page 500 in the preceding 
chapter. For the story of an unschooled mule driver who contributed to Hubble’s discovery, see Toy/Universe, 
p. 52. 




where the index n runs over the set the index j runs over plus k. (Does the last phrase 
make sense? If not, read the preceding sentence.) 

A key physical feature of our present understanding of cosmology is that these different 
ingredients do not interact with one another and are tied together only by gravity. You 
might think that stars produce light, thus converting matter into radiation, but on the 
cosmic scale, this effect is totally negligible, so that $\rho_m$ and $\rho_r$ evolve independently.

Let us divide (6) by $H^2$ and define

$$\Omega_j \equiv \frac{8\pi G}{3H^2}\rho_j = \frac{\rho_j}{\rho_c} \tag{7}$$

and

$$\Omega_k \equiv -\frac{k}{H^2R^2} = -\frac{k}{\dot R^2} \tag{8}$$

(Note that the sign of $\Omega_k$ is opposite to that of $k$.) In (7), we have recalled that in the preceding chapter, we defined the critical density² $\rho_c \equiv \frac{3H^2}{8\pi G}$. Thus, $\Omega_j$ has the pleasing interpretation as the ratio of the density of the “jth kind of stuff” to the critical density.

As we will see, there is some arithmetical advantage to regarding the curvature term as 
a kind of density, but physically, you should keep in mind that it originates in geometry. 


With the definition

$$\Omega \equiv \sum_j \Omega_j \tag{9}$$

(notice that $\Omega$ does not include the curvature contribution), we may rewrite (6) as

$$1 = \Omega + \Omega_k = \Big(\sum_j \Omega_j\Big) + \Omega_k = \sum_n \Omega_n \tag{10}$$

We can think of this as telling us that stuff plus curvature equals unity.

As discussed in the preceding chapter, the parameter $\Omega$ determines whether our universe is closed, flat, or open, according to whether $\Omega > 1$, $\Omega = 1$, or $\Omega < 1$, since correspondingly $k$ could be equal to $+1$, $0$, or $-1$, respectively. For many years, it was believed that the universe was closed, but recent observational evidence indicates that $\Omega$ is very close to 1, so that $\Omega_k \simeq 0$ and the universe seems quite flat.

Thus far, we have been merely defining and rewriting. Much of this defining and 
rewriting is to connect the terminology used by theoretical physicists to that used by 
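observational cosmologists. Recall that in the preceding chapter, we found that as the universe expands, $\rho_j$ varies according to $\rho_j \propto a^{-3(1+w_j)} \equiv a^{-\nu_j}$. Let me remind you that

$$w_r = \tfrac{1}{3}, \quad w_m = 0, \quad w_\Lambda = -1 \tag{11}$$

so that

$$\nu_r = 4, \quad \nu_m = 3, \quad \nu_\Lambda = 0 \tag{12}$$

Hence, setting $a_0 = 1$, we have

$$\Omega_j = \left(\frac{H_0}{H}\right)^2 \frac{1}{a^{\nu_j}}\,\Omega_{j,0} \tag{13}$$

(As usual, the subscript 0 indicates the present value of various quantities.) It is also convenient to define the analogous expression for $\Omega_k$ with $\nu_k = 2$. We can then rewrite (10) as

$$1 = \left(\frac{H_0}{H}\right)^2 \sum_n \frac{\Omega_{n,0}}{a^{\nu_n}} \tag{14}$$

or

$$H^2 = H_0^2 \sum_n \frac{\Omega_{n,0}}{a^{\nu_n}} = H_0^2\left(\frac{\Omega_{m,0}}{a^3} + \frac{\Omega_{r,0}}{a^4} + \Omega_{\Lambda,0} + \frac{\Omega_{k,0}}{a^2}\right) \tag{15}$$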


The observed values for these cosmological parameters are

$$\Omega_{\Lambda,0} \simeq 0.7, \quad \Omega_{m,0} \simeq 0.3, \quad \Omega_{r,0} \simeq 5 \times 10^{-5} \tag{16}$$

and $H_0 \simeq 70$ km/sec/Mpc $\simeq 2 \times 10^{-18}$ sec$^{-1}$, where 1 Mpc = $10^6$ parsec or $\sim 3 \times 10^{19}$ km.

The matter contribution $\Omega_{m,0}$ to $\Omega$ consists of dark matter $\Omega_{\mathrm{dm},0} \simeq 0.25$ and baryonic matter $\Omega_{b,0} \simeq 0.04$. The surprising discovery has been that the baryonic matter we know and love comprises only a teeny contribution to the energy budget of the universe. Another remarkable cosmological fact is that only a small fraction of the baryonic matter, contributing about 0.008 to $\Omega$, resides in stars; the rest appears to be in interstellar and intergalactic gases.

Keep in mind that our index $j$ runs over $r$, $m$, and $\Lambda$, while the index $n$ runs over the
range of j plus the curvature term k. 
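
As a numerical companion to (15) and the values quoted in (16), here is a minimal sketch in Python. The function name `hubble_rate`, the constant names, and the default arguments are illustrative choices rather than notation from the text; the conversion factors are standard values assumed here.

```python
import math

KM_PER_MPC = 3.0857e19      # kilometers in one megaparsec (about 3 x 10^19, as quoted above)
SEC_PER_GYR = 3.156e16      # seconds in 10^9 years (assumed conversion factor)

def hubble_rate(a, H0=70.0, om_m=0.3, om_r=5e-5, om_l=0.7):
    """H(a) from (15), in the units of H0, with a = 1 today."""
    om_k = 1.0 - om_m - om_r - om_l          # (10) evaluated at present
    return H0 * math.sqrt(om_m / a**3 + om_r / a**4 + om_l + om_k / a**2)

H0_per_sec = 70.0 / KM_PER_MPC               # 70 km/sec/Mpc converted to 1/sec
hubble_time_gyr = 1.0 / (H0_per_sec * SEC_PER_GYR)

print(f"H0 = {H0_per_sec:.2e} per sec, 1/H0 = {hubble_time_gyr:.1f} Gyr")
print(f"H(a = 0.5) / H0 = {hubble_rate(0.5) / 70.0:.2f}")
```

By construction `hubble_rate(1.0)` returns `H0`, which is just (10) evaluated today.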


Constructing a 2-dimensional map of universes 


It may seem like the height of hubris, but within the context of this discussion, we can characterize our universe in terms of three present-day values $\{\Omega_{r,0}, \Omega_{m,0}, \Omega_{\Lambda,0}\}$. In fact, since $\Omega_{r,0} \ll \Omega_{m,0}, \Omega_{\Lambda,0}$, we can make do with a 2-dimensional parameter space with the two axes $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$, which we could happily plot (see figure 1) on a piece of paper.

Note that the universe considered in chapter VI.2 sits at the point $\{\Omega_{m,0} = 0, \Omega_{\Lambda,0} = 1\}$. Draw the line

$$\Omega_{m,0} + \Omega_{\Lambda,0} = 1 \tag{17}$$

From (10), $\Omega_{m,0} + \Omega_{\Lambda,0} + \Omega_{k,0} = 1$, we see that the universe is closed above this line and open below it. A universe sitting right on this line is flat.


Acceleration or deceleration? 


At this point, let us also make use of the other Einstein field equation

$$\frac{\ddot R}{R} = \frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3P) = -\frac{4\pi G}{3}\sum_j (1 + 3w_j)\rho_j \tag{18}$$

which describes the acceleration of the cosmic expansion. Define the deceleration parameter

$$q \equiv -\frac{a\ddot a}{\dot a^2} = -\frac{\ddot R/R}{H^2} = \frac{1}{2}\sum_j (1 + 3w_j)\Omega_j = \frac{1}{2}(2\Omega_r + \Omega_m - 2\Omega_\Lambda) \tag{19}$$

(Note that for many cosmological parameters, we could freely choose to use $a$ or $R$.) That $q$ is defined with a minus sign (and thus known as the deceleration parameter) is historical: before the discovery of the dark energy, it was thought that all possible values of $w_j$ were positive and that the cosmic expansion was thus decelerating. You are of course free to
define the acceleration parameter Q = a if you like. 

As explained in the preceding chapter, in the absence of the cosmological constant, or if the cosmological constant $\Lambda$ is negative, then $q$ is manifestly positive, and the expansion of the universe will slow down.

Since from (19), neglecting the tiny radiation contribution, we have

$$-q_0 = \Omega_{\Lambda,0} - \frac{1}{2}\Omega_{m,0} \tag{20}$$

cosmic expansion is accelerating above the line

$$\Omega_{\Lambda,0} = \frac{1}{2}\Omega_{m,0} \tag{21}$$

and decelerating below this line.
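
The two straight lines (17) and (21) can be turned into a small classifier. The following is a sketch only: the function name `classify`, the tolerance `tol`, and the label strings are hypothetical choices made for illustration, and radiation is neglected as in (20).

```python
def classify(om_m, om_l, tol=1e-12):
    """Geometry and present acceleration for a point (om_m, om_l) of the cosmic diagram."""
    total = om_m + om_l
    if abs(total - 1.0) < tol:
        geometry = "flat"            # on the line (17)
    elif total > 1.0:
        geometry = "closed"          # above the line (17)
    else:
        geometry = "open"            # below the line (17)

    q0 = 0.5 * om_m - om_l           # deceleration parameter today, from (20)
    if abs(q0) < tol:
        motion = "coasting"          # exactly on the line (21)
    elif q0 < 0.0:
        motion = "accelerating"
    else:
        motion = "decelerating"
    return geometry, motion, q0

print(classify(0.3, 0.7))   # the observationally favored point
print(classify(1.0, 0.0))   # flat, matter only
```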


The fate of the universe 


Will the universe expand forever? Did it have a Big Bang? 

By now, you should have enough understanding to answer the following questions 
qualitatively. If dark energy overwhelms dark matter, did it bang? Yes or no? 

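Let us now quantify the word “overwhelm” by determining the two curved lines in figure 1. Write (15) as

$$\frac{1}{H_0^2}\dot a^2 - \left(\frac{\Omega_{m,0}}{a} + \frac{\Omega_{r,0}}{a^2} + \Omega_{\Lambda,0}a^2 + \Omega_{k,0}\right) = 0 \tag{22}$$

which we can interpret as a Newtonian problem of a particle of mass $m = \frac{2}{H_0^2}$ with zero total energy moving in a potential $V(a) = -\left(\frac{\Omega_{m,0}}{a} + \frac{\Omega_{r,0}}{a^2} + \Omega_{\Lambda,0}a^2 + \Omega_{k,0}\right)$. We can eliminate $\Omega_{k,0} = 1 - (\Omega_{m,0} + \Omega_{r,0} + \Omega_{\Lambda,0})$ by evaluating (15) at the present time.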


Perhaps astonishingly, the Newtonian mechanical analog keeps popping up in general relativity, whether we are studying the motion of a particle around a black hole or figuring out how the universe evolves. By now you surely understand why this is so. In the present context, Einstein’s field equation involves, by construction, two powers of derivatives. Because of the perfect cosmological principle, the entire metric is described by one function of a single variable, so that we have ordinary, rather than partial, differential equations. Thanks to Bianchi’s identity, we can eliminate the second derivative $\ddot a$. As a result, we happily end up with a problem in Newtonian mechanics, which we can readily solve


Figure 2 The cosmic potential for some representative values of $(\Omega_{m,0}, \Omega_{\Lambda,0})$ plotted against the scale factor of the universe. (a) For $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0.3, -0.8)$. The universe started in a Big Bang, with the present expansion headed toward an eventual collapse back into a Big Crunch. (b) For $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0.3, 0.8)$. The universe started out in a Big Bang and then proceeds to expand forever. (c) For $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (2.2, 0.03)$. Note that this is for large $\Omega_{m,0}$ and small $\Omega_{\Lambda,0}$, and that the maximum of the potential barely sticks above the horizontal axis. Either the universe started in a Big Bang but will eventually collapse back, or the universe never did have a Big Bang and will expand forever.


numerically in the general case. In fact, setting $\Omega_{r,0} = 0$, we can analyze the problem completely, as we will now show.

To a good approximation, then, we have a Newtonian particle with zero total energy moving in the potential (see figure 2)

$$V(a) = -\left(\frac{\Omega_{m,0}}{a} + \Omega_{\Lambda,0}a^2\right) - (1 - \Omega_{m,0} - \Omega_{\Lambda,0}) \tag{23}$$

Note that this cosmic potential is the sum of three terms: a $-1/a$ term; an $a^2$ term, whose coefficient can take either sign; and a constant.

For small $a$, the potential $V \simeq -\frac{\Omega_{m,0}}{a}$ is attractive like an inverse square law; for large $a$, $V \simeq -\Omega_{\Lambda,0}a^2$ is repulsive or attractive, according to whether $\Omega_{\Lambda,0} > 0$ or $< 0$, like an inverted or a normal harmonic oscillator, respectively. The constant term in $V$ moves the potential up or down. The boundary condition is that at present, $a = 1$ and $\dot a > 0$.

We can analyze the negative cosmological constant $\Omega_{\Lambda,0} < 0$ case instantly. The potential $V(a) = -\frac{\Omega_{m,0}}{a} + |\Omega_{\Lambda,0}|a^2 - (1 - \Omega_{m,0} + |\Omega_{\Lambda,0}|)$ is entirely attractive. See figure 2a. The particle is at present climbing the hill, but when it reaches the point where $V(a) = 0$, its velocity vanishes ($\dot a = 0$), since it has zero total energy, as you recall. It then turns around and slides back down the hill. In other words, the universe started in a Big Bang, with the present expansion headed toward an eventual collapse back into a Big Crunch, when $a(t)$ will once again vanish. This takes care of the entire lower half plane $\Omega_{\Lambda,0} < 0$ in the cosmic diagram.

We now take $\Omega_{\Lambda,0} > 0$. The potential now reaches a maximum at some $a_{\max}$. There are two possibilities, as shown in figure 2b and figure 2c, according to whether the maximum of the potential sits below the horizontal axis (that is, $V(a_{\max}) < 0$), or sticks up above the horizontal axis (that is, $V(a_{\max}) > 0$).

Again, recall that the Newtonian analog particle has zero total energy. In the situation described in figure 2b, it has enough energy to reach the top of the potential and then cruise down. The universe started out in a Big Bang and then proceeds to expand forever.

We have to divide the situation described in figure 2c further into two cases, according to whether $a_{\max} > 1$ or $a_{\max} < 1$.

Suppose that $a_{\max} > 1$. Since at present, $a = 1$, by definition, with $\dot a > 0$ by observation, then, as represented by the Newtonian particle, we have not yet gotten to the top of the hill, which we are determinedly climbing. But when we reach $V(a) = 0$, we will run out of steam; not having enough oomph to reach the top, we will fall back in toward $a = 0$. In other words, the universe will eventually collapse.

Now suppose that $a_{\max} < 1$. Then we, presently at $a = 1$ with $\dot a > 0$, are already on the other side of the hill and are happily rolling downhill with ever increasing speed. We have never even been to $a = 0$: there was no Big Bang in our past. The universe never did have a Big Bang and will expand forever.

Note that in both of these cases shown in figure 2c, the Newtonian particle, with its 
zero total energy, can never have gotten above the horizontal axis. The entire history of the 
universe is described by the piece of V(a) below the horizontal axis, either the piece on 
the left or that on the right. 

We thus see that the dividing lines between these different scenarios are determined by first finding out if $V'(a_{\max}) = 0$ has a solution, and if it does, then setting $V(a_{\max}) = 0$. We can use one equation to eliminate $a_{\max}$ in the other, thus obtaining a relation between $\Omega_{m,0}$ and $\Omega_{\Lambda,0}$, leading to the curves shown in the cosmic diagram. To determine the behavior on the two sides of these curves, we have to further ascertain whether $a_{\max} > 1$ or $a_{\max} < 1$.
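
The case analysis above translates almost line by line into code. The sketch below assumes $\Omega_{r,0} = 0$ and $\Omega_{m,0} > 0$; the function names `potential` and `fate` and the returned pair of booleans are illustrative choices, not notation from the text. For $\Omega_{\Lambda,0} = 0$ it uses the familiar result that the universe recollapses only if $\Omega_{m,0} > 1$.

```python
def potential(a, om_m, om_l):
    """Cosmic potential V(a) of (23), radiation neglected."""
    return -(om_m / a + om_l * a**2) - (1.0 - om_m - om_l)

def fate(om_m, om_l):
    """Return (had_big_bang, expands_forever) for a universe with om_m > 0."""
    if om_m <= 0.0:
        raise ValueError("this sketch assumes om_m > 0")
    if om_l < 0.0:
        return True, False                       # entirely attractive: bang, then crunch
    if om_l == 0.0:
        return True, om_m <= 1.0                 # no cosmological constant
    a_max = (om_m / (2.0 * om_l)) ** (1.0 / 3.0) # solves V'(a) = 0
    if potential(a_max, om_m, om_l) <= 0.0:
        return True, True                        # figure 2b (boundary case lumped in here)
    if a_max > 1.0:
        return True, False                       # figure 2c, we sit to the left of the hill
    return False, True                           # figure 2c, we sit to the right of the hill

for point in [(0.3, -0.8), (0.3, 0.8), (2.2, 0.03), (0.3, 0.7)]:
    print(point, fate(*point))
```

Because $V(1) = -1$ for any choice of parameters, the present epoch $a = 1$ never sits on a stretch of the potential above the axis, so the comparison of $a_{\max}$ with 1 is unambiguous whenever $V(a_{\max}) > 0$.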

Quite remarkably, at this stage, to work out the cosmic diagram requires no more than high school algebra, not even solving a differential equation. You should challenge yourself before reading the solution, which I will relegate to the appendix. Observationally, as I have already mentioned, the favored region forms a small circle* centered at $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0.3, 0.7)$, far from the two curves we just discussed (see figure 1). Nevertheless, theoretically it is quite interesting to work out these two curves.


Einstein’s static universe and his second greatest blunder 


In the era when Einstein ever so boldly† ventured to apply his theory of gravity to the entire universe, physicists were philosophically prejudiced in favor of a static universe. Indeed, the expanding universe that we all, including the proverbial person in the street, take for granted was inconceivable once upon a time. When Einstein found, to his horror, that his field equation implied an expanding universe, he solved the perceived difficulty by introducing a cosmological constant and showing that he could have a static universe if $\Lambda = \frac{1}{2}\rho_m$ (as you can also show in exercise 3). He thus missed a tremendous opportunity to predict that the universe expands.

* The size of the circle changes as observation improves.
† Indeed, fearing for the asylum. See the opening of chapter VI.2.

For those who delight in yakking about Einstein’s greatest blunder, an idle preoccupation 
that strikes me as somewhat unseemly, I have already expressed my humble opinion that 
the greatest blunder was not the introduction of the cosmological constant, as the popular 
press would have it (which is in fact required by quantum field theory, as I explained 
in chapter VI.2 and will discuss in more detail in chapter X.7, and in any case appears 
observationally to be here to stay). Rather, it was his failure to use the action principle. 
With your indulgence, we will now talk about Einstein’s second greatest blunder, which is 
not to check whether his solution was stable. 

In light of the Newtonian analog potential in (23), this instability is starkly evident. Differentiating and setting $V'(a) = \frac{\Omega_{m,0}}{a^2} - 2\Omega_{\Lambda,0}a$ to 0 and $a$ to 1, we recover Einstein’s condition $2\Omega_{\Lambda,0} = \Omega_{m,0}$ expressed in the cosmologist’s language. Einstein’s static universe corresponds to sitting at the maximum of the potential in figure 2b.


Flow in the cosmic diagram 


It is perhaps worth emphasizing that all relevant physics within the present context is contained in the cosmological equation

$$\dot R^2 + k = \frac{8\pi G}{3}\rho R^2 \tag{24}$$

and energy conservation

$$\frac{d}{dt}(\rho a^3) = -P\,\frac{d}{dt}(a^3) \tag{25}$$

The discussion here is merely expressing the same physics in a notation particularly convenient for observational cosmology.

Thanks to the Bianchi identity, we can recover the rest of Einstein’s field equation from (24) and (25). It is instructive to verify this. First, rewrite (25) as

$$\dot\rho = -3(\rho + P)\frac{\dot R}{R} \tag{26}$$

Next differentiate (24) to obtain an equation for $\ddot R$, eliminating $\dot R$ by using (24) and $\dot\rho$ by using (26). Not surprisingly, we get the time-time component of Einstein’s field equation:

$$\frac{\ddot R}{R} = -\frac{4\pi G}{3}(\rho + 3P) \tag{27}$$

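You are of course free to keep on massaging these equations every which way. For example, we might ask how $\Omega_j$ varies with time. Recalling the definition $\Omega_j = \frac{8\pi G}{3H^2}\rho_j$, we have

$$\dot\Omega_j = \Omega_j\left(\frac{\dot\rho_j}{\rho_j} - 2\frac{\dot H}{H}\right) = -\Omega_j\left(3(1 + w_j)H + 2\frac{\dot H}{H}\right) \tag{28}$$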


Figure 3 Neglecting radiation, we can picture the time evolution of the universe as a fluid flowing in the 2-dimensional space spanned by $(\Omega_m, \Omega_\Lambda)$, plotted here for a universe in an expanding phase. Of the three fixed points, $(\Omega_m, \Omega_\Lambda) = (0, 1)$, $(0, 0)$, and $(1, 0)$, only the one at $(0, 1)$ is a stable attractor.


where in the last step, we used (26). Now

$$\frac{\dot H}{H^2} = \frac{1}{H^2}\frac{d}{dt}\left(\frac{\dot R}{R}\right) = \frac{1}{H^2}\left(\frac{\ddot R}{R} - \frac{\dot R^2}{R^2}\right) = -q - 1$$

using (19) in the last step. Inserting this into (28), we obtain the nifty equation

$$\dot\Omega_j = H\Omega_j\Big(-3w_j - 1 + \sum_i (1 + 3w_i)\Omega_i\Big) \tag{29}$$

The rate of change of $\Omega_j$ depends on the other $\Omega_i$s. We can think of this as defining a flow in the space spanned by the $\Omega$s.

For example, if we neglect $\Omega_r$ as before, we have

$$\dot\Omega_m = H\Omega_m(\Omega_m - 2\Omega_\Lambda - 1) \quad\text{and}\quad \dot\Omega_\Lambda = H\Omega_\Lambda(\Omega_m - 2\Omega_\Lambda + 2) \tag{30}$$

We can think of this as defining a velocity field $\vec v = (\dot\Omega_m, \dot\Omega_\Lambda)$ for a fluid flowing in the 2-dimensional space spanned by $(\Omega_m, \Omega_\Lambda)$, which we plot in figure 3, assuming that the universe is in an expanding phase $H > 0$. To facilitate plotting the velocity field, notice that $\dot\Omega_m < 0$ above the line $\Omega_\Lambda = \frac{1}{2}(\Omega_m - 1)$ and $> 0$ below. Similarly, $\dot\Omega_\Lambda < 0$ above the line $\Omega_\Lambda = \frac{1}{2}\Omega_m + 1$ and $> 0$ below. There are three fixed points, $(\Omega_m, \Omega_\Lambda) = (0, 1)$, $(0, 0)$, and $(1, 0)$, defined as places where the velocity field vanishes. The fixed point at $(0, 1)$ is stable, known variously as an attractor or a sink in different areas* of physics, in the sense


* In quantum field theory and in condensed matter physics, this kind of flow is known as a renormalization 
group flow. The quantum and thermal fluctuations responsible for the flow are, however, completely absent in 
the present context. 
that a particle flowing nearby would end up there. In contrast, we have two unstable fixed
points at (0, 0) and (1, 0). In particular, a universe at (0.3, 0.7) will eventually end up at 
(0, 1). But we know all this already: dark energy in the form of a cosmological constant 
will eventually overwhelm the dark matter. 

We see the essential role played by the cosmological constant. In its absence, we have a 1-dimensional flow along the $\Omega_m$ axis, as shown in figure 3: if $\Omega_m < 1$, the universe flows toward $\Omega_m = 0$, and if $\Omega_m > 1$, the universe flows to arbitrarily large values. Of the two unstable fixed points in the 2-dimensional flow, the one at $(0, 0)$ becomes stable if we restrict the flow to be along the $\Omega_m$ axis. When a positive $\Omega_\Lambda$, no matter how small, is introduced, the fluid flows away from the $\Omega_m$ axis: $\Omega_\Lambda$ is known as a relevant perturbation. Again, this is just old knowledge repackaged: as the universe expands, matter density is diluted to nothing, while the cosmological constant remains constant. In a contracting phase, as indicated by (30), everything is reversed, of course.
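
To watch the flow (30) in action, one can integrate it numerically. The sketch below trades $t$ for $\ln a$ as the time variable (since $d\ln a/dt = H$, this simply removes the overall factor of $H$) and uses a fixed-step fourth-order Runge-Kutta scheme. The function names `velocity` and `flow` and the step count are illustrative assumptions.

```python
import math

def velocity(om_m, om_l):
    """Right-hand side of (30) divided by H: derivatives with respect to ln a."""
    return (om_m * (om_m - 2.0 * om_l - 1.0),
            om_l * (om_m - 2.0 * om_l + 2.0))

def flow(om_m, om_l, ln_a_final, steps=2000):
    """Integrate the flow from ln a = 0 to ln_a_final with fixed-step RK4."""
    h = ln_a_final / steps
    for _ in range(steps):
        k1 = velocity(om_m, om_l)
        k2 = velocity(om_m + 0.5 * h * k1[0], om_l + 0.5 * h * k1[1])
        k3 = velocity(om_m + 0.5 * h * k2[0], om_l + 0.5 * h * k2[1])
        k4 = velocity(om_m + h * k3[0], om_l + h * k3[1])
        om_m += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        om_l += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return om_m, om_l

future = flow(0.3, 0.7, 5.0)     # scale factor grows by e^5
past = flow(0.3, 0.7, -5.0)      # scale factor shrinks by e^-5
print("future:", future)
print("past:  ", past)
```

Running forward, the point $(0.3, 0.7)$ drifts toward the attractor at $(0, 1)$; running backward, it approaches the matter-dominated fixed point $(1, 0)$.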


Is flat stable? 


It is interesting to apply this flow language to the curvature density $\Omega_k = -\frac{k}{H^2R^2}$. Note that, in contrast to $k$, the quantity $\Omega_k$ is continuous rather than discrete, and so it makes sense to talk about its rate of change. Going through similar steps as above, we obtain

$$\dot\Omega_k = -2\Omega_k\left(\frac{\dot R}{R} + \frac{\dot H}{H}\right) = 2q\Omega_k H = H\Omega_k(2\Omega_r + \Omega_m - 2\Omega_\Lambda) \tag{31}$$

Is a flat universe stable? If $\Omega_k$ is strictly 0, that is, if the integer $k = 0$, then $\dot\Omega_k = 0$ and $\Omega_k$ stays at 0. The issue is whether a universe with $\Omega_k \simeq 0$ flows toward or away from $\Omega_k = 0$. Again, assume that the universe is in an expanding phase, so that $H > 0$.

As you can see from (31), if $2\Omega_r + \Omega_m > 2\Omega_\Lambda$ (which is a fortiori satisfied if $\Omega_\Lambda < 0$ or if there is no cosmological constant, $\Omega_\Lambda = 0$), then $\Omega_k = 0$ is an unstable fixed point.

In contrast, if $2\Omega_r + \Omega_m < 2\Omega_\Lambda$, then $\Omega_k = 0$ is a stable fixed point. But since $\Omega_r$ and $\Omega_m$ vanish rapidly, given enough time, this condition will eventually be satisfied. We will return to this point in the next chapter.
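
A toy illustration of the instability, under the simplifying assumption of a universe containing only matter and curvature: setting $\Omega_r = \Omega_\Lambda = 0$ in (31) and using $\Omega_m = 1 - \Omega_k$ from (10) gives $d\Omega_k/d\ln a = \Omega_k(1 - \Omega_k)$. The function name `omega_k_after` and the starting value are hypothetical choices for this sketch.

```python
import math

def omega_k_after(om_k0, ln_a, steps=5000):
    """Evolve Omega_k in a universe with matter and curvature only.

    Integrates d(Omega_k)/d(ln a) = Omega_k * (1 - Omega_k) with fixed-step RK4.
    """
    def rhs(x):
        return x * (1.0 - x)

    h = ln_a / steps
    om_k = om_k0
    for _ in range(steps):
        k1 = rhs(om_k)
        k2 = rhs(om_k + 0.5 * h * k1)
        k3 = rhs(om_k + 0.5 * h * k2)
        k4 = rhs(om_k + h * k3)
        om_k += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return om_k

start = 1e-6
end = omega_k_after(start, math.log(1000.0))   # the universe expands by a factor of 1000
print(f"Omega_k grows from {start:.1e} to {end:.2e}")
```

While $\Omega_k$ is small it grows roughly in proportion to $a$, so a tiny departure from flatness is amplified by the expansion.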


Age of the universe


Since $H_0$ has dimensions of inverse time, a rough estimate of the age of the universe is simply $t_{\rm age} \simeq 1/H_0$, but given the accuracy to which the cosmological parameters are now known, we can do better. Using (15), we obtain

$$t_{\rm age} = \int_0^{t_{\rm age}} dt = \int_0^1 \frac{da}{\dot a} = \int_0^1 \frac{da}{aH} = \frac{1}{H_0}\int_0^1 \frac{da}{\left[\Omega_{m,0}a^{-1} + \Omega_{r,0}a^{-2} + \Omega_{\Lambda,0}a^2 + (1 - \Omega_{m,0} - \Omega_{\Lambda,0} - \Omega_{r,0})\right]^{1/2}} \tag{32}$$
In general, this will have to be integrated numerically. For a flat universe, if we neglect radiation, the integral can be done exactly:

$$t_{\rm age} = \frac{2}{3H_0\sqrt{\Omega_{\Lambda,0}}}\log\left(\frac{\sqrt{\Omega_{\Lambda,0}} + \sqrt{\Omega_{m,0} + \Omega_{\Lambda,0}}}{\sqrt{\Omega_{m,0}}}\right), \qquad \Omega_{\Lambda,0} > 0 \tag{33}$$

Historically, the age of the universe presented a thorny issue for a long time, as it came out shorter than the age of certain stars and galaxies estimated reliably using other methods. Since a universe with only $\Omega_{\Lambda,0} = 1$ has no Big Bang and hence infinite $t_{\rm age}$, the discovery of dark energy has evidently served to resolve this problem. Indeed, we can now turn the argument around, so that we can use the inferred age of a galaxy observed at such and such redshift to set a lower bound on $\Omega_{\Lambda,0}$.
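
Both (32) and (33) are easy to evaluate. The sketch below returns the age in units of $1/H_0$; the function names, the midpoint-rule quadrature, and the number of sample points are illustrative choices, and the conversion of $1/H_0$ to roughly 14 billion years assumes $H_0 \simeq 70$ km/sec/Mpc.

```python
import math

def age_flat_exact(om_m, om_l):
    """H0 * t_age from (33); assumes om_m + om_l = 1, om_l > 0, and no radiation."""
    root_l = math.sqrt(om_l)
    return 2.0 / (3.0 * root_l) * math.log(
        (root_l + math.sqrt(om_m + om_l)) / math.sqrt(om_m))

def age_numeric(om_m, om_l, om_r=0.0, n=100000):
    """H0 * t_age from (32), integrated over 0 < a < 1 by the midpoint rule."""
    om_k = 1.0 - om_m - om_l - om_r
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * h
        total += h / math.sqrt(om_m / a + om_r / a**2 + om_l * a**2 + om_k)
    return total

hubble_time_gyr = 14.0   # rough value of 1/H0 for H0 near 70 km/sec/Mpc (assumed)
exact = age_flat_exact(0.3, 0.7)
print(f"H0 * t_age (exact)   = {exact:.4f}")
print(f"H0 * t_age (numeric) = {age_numeric(0.3, 0.7):.4f}")
print(f"t_age is roughly {exact * hubble_time_gyr:.1f} billion years")
```

For a flat universe with matter only, the numerical integral reduces to $\int_0^1 \sqrt{a}\,da = 2/3$, the result asked for in exercise 1.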


Appendix: Phase boundaries in the cosmic diagram 


To determine the curves in figure 1, it turns out that we have to solve a cubic equation. So let’s first recall the (hyperbolic) cosine and sine method for solving the cubic. A cubic equation can always be cast in the form $4x^3 - 3x - s = 0$. For our problem, we want the real positive root (and in case there are more than one, the smaller of the two). For $s > 1$, there is one real positive root. Use the identity $4(\cosh\beta)^3 = 3\cosh\beta + \cosh 3\beta$. Then the solution is evidently $x = \cosh\beta$, with $\beta$ determined by $s = \cosh 3\beta$. For $0 < s < 1$, change the hyperbolic cosine in the preceding to a cosine. For $s < 0$, use the identity $4(\sin\beta)^3 = 3\sin\beta - \sin 3\beta$. Then the solution is $x = \sin\beta$, with $\beta$ determined by $s = -\sin 3\beta$. In particular, as $s \to 0$, the solution $x \to 0$.
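
The three branches of this method fit in a few lines. This is a sketch under the stated conventions; the function name `cubic_root` is a hypothetical label, and the branch for $s < -1$ is rejected because the cubic then has no positive real root.

```python
import math

def cubic_root(s):
    """Root of 4x^3 - 3x - s = 0 selected in the text (smaller positive root when s < 0)."""
    if s >= 1.0:
        return math.cosh(math.acosh(s) / 3.0)    # x = cosh(beta), s = cosh(3 beta)
    if s > 0.0:
        return math.cos(math.acos(s) / 3.0)      # x = cos(beta),  s = cos(3 beta)
    if s >= -1.0:
        return math.sin(math.asin(-s) / 3.0)     # x = sin(beta),  s = -sin(3 beta)
    raise ValueError("no positive real root for s < -1")

for s in (9.0, 0.5, -0.3):
    x = cubic_root(s)
    print(f"s = {s:5.2f}  x = {x:.6f}  residual = {4 * x**3 - 3 * x - s:.1e}")
```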

The case $\Omega_{m,0} = 0$ was treated in chapter VI.5. The case $\Omega_{\Lambda,0} = 0$ is also easily analyzed by looking at a plot of the potential $V(a) = -\frac{\Omega_{m,0}}{a} - (1 - \Omega_{m,0})$. For $\Omega_{m,0} < 1$, the particle in the Newtonian analogy is unbound, and the universe expands forever. For $\Omega_{m,0} > 1$, the particle is bound, and the universe expands and then collapses.

Taking $\Omega_{m,0} \neq 0$, we divide $V(a)$ by $\Omega_{m,0}$ for convenience, so that we can write the potential effectively as $V(a) = -a^{-1} - 4x^3a^2 - (s - 4x^3)$ after defining $x = \left(\frac{\Omega_{\Lambda,0}}{4\Omega_{m,0}}\right)^{1/3}$ and $s = \frac{1 - \Omega_{m,0}}{\Omega_{m,0}}$. The condition $V'(a_{\max}) = 0$ then gives $a_{\max} = \frac{1}{2x}$, which when substituted back, yields $V(a_{\max}) = 4x^3 - 3x - s$. The condition $V(a_{\max}) = 0$ then produces the cubic solved above.
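For $\Omega_{m,0} \gtrsim 0$, $s \gg 1$, and we have the solution³

$$\Omega_{\Lambda,0} = 4\Omega_{m,0}\left\{\cosh\left[\frac{1}{3}\operatorname{arccosh}\left(\frac{1 - \Omega_{m,0}}{\Omega_{m,0}}\right)\right]\right\}^3 = 1 + 3\left(\frac{\Omega_{m,0}}{2}\right)^{2/3} - \Omega_{m,0} + 3\left(\frac{\Omega_{m,0}}{2}\right)^{4/3} + \cdots \tag{34}$$

This gives the curve starting at the point $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (0, 1)$ separating universes that banged from those that did not. Note that since $s \gg 1$, implying $\beta > 1$, then $a_{\max} \ll 1$ and the universe expands forever on both sides of this phase boundary.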

For $\Omega_{m,0} \gtrsim 1$, we have the solution⁴

$$\Omega_{\Lambda,0} = 4\Omega_{m,0}\left\{\sin\left[\frac{1}{3}\arcsin\left(\frac{\Omega_{m,0} - 1}{\Omega_{m,0}}\right)\right]\right\}^3 = \frac{4}{27}(\Omega_{m,0} - 1)^3 - \frac{8}{27}(\Omega_{m,0} - 1)^4 + \cdots \tag{35}$$

This gives the curve starting at the point $(\Omega_{m,0}, \Omega_{\Lambda,0}) = (1, 0)$ separating universes that will expand forever from those that will collapse. As indicated in the cosmic diagram, all these universes had a Big Bang in their past.
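
As a check on the two phase boundaries, the sketch below evaluates the closed forms (34) and (35) and verifies that the maximum of the potential (23) touches the horizontal axis on each curve. The function names are illustrative, and the stated ranges of `om_m` are assumptions chosen so that the arguments of `acosh` and `asin` are valid.

```python
import math

def no_bang_boundary(om_m):
    """Omega_Lambda,0 on the curve (34); assumes 0 < om_m <= 1/2 so that s >= 1."""
    s = (1.0 - om_m) / om_m
    return 4.0 * om_m * math.cosh(math.acosh(s) / 3.0) ** 3

def recollapse_boundary(om_m):
    """Omega_Lambda,0 on the curve (35); assumes om_m >= 1."""
    return 4.0 * om_m * math.sin(math.asin((om_m - 1.0) / om_m) / 3.0) ** 3

def potential_at_maximum(om_m, om_l):
    """V(a_max) for the potential (23), with a_max the root of V'(a) = 0 (om_l > 0)."""
    a_max = (om_m / (2.0 * om_l)) ** (1.0 / 3.0)
    return -(om_m / a_max + om_l * a_max**2) - (1.0 - om_m - om_l)

for om_m, boundary in [(0.1, no_bang_boundary), (1.1, recollapse_boundary)]:
    om_l = boundary(om_m)
    print(f"om_m = {om_m}: om_l = {om_l:.6g}, V(a_max) = {potential_at_maximum(om_m, om_l):.1e}")
```

On either curve `potential_at_maximum` should vanish up to rounding error, which is the condition $V(a_{\max}) = 0$ used in the derivation.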

I envisage a titanic struggle between dark energy and dark matter. Dark matter hardly has a chance against the explosively exponential expansion of a positive cosmological constant, but what we just learned is that for a tiny $\Omega_{\Lambda,0} > 0$, a sliver of opportunity still exists for dark matter. For example, (35) tells us that for $\Omega_{m,0} = 1.1$, if $\Omega_{\Lambda,0} \lesssim 0.12 \times 10^{-3}$, then dark matter could still reverse the expansion.
Note that before the discovery of dark energy, $\Omega_{\Lambda,0}$ was presumed to vanish, and so cosmologists were restricted to the $\Omega_{m,0}$-axis. We can use the cosmic diagram to verify the results in the previous chapter. Thus, along the $\Omega_{m,0}$-axis, for $\Omega_{m,0} < 1$, the universe is open and expands forever, while for $\Omega_{m,0} > 1$, it is closed and will eventually collapse. All these universes decelerate and had a Big Bang.


Exercises


1 For a flat universe filled with only nonrelativistic matter, show that $t_{\rm age} = \frac{2}{3H_0}$ and $\rho = 1/(6\pi G t^2)$.

2 For a flat universe filled with only relativistic matter, show that $t_{\rm age} = \frac{1}{2H_0}$ and $\rho = 3/(32\pi G t^2)$.

3 Obtain Einstein’s static universe directly from his field equation and determine the radius of his static universe.

4 Derive the expression for the curvature density used by astronomers

$$\Omega_k(z) = \frac{\Omega_{k,0}}{\Omega_{m,0}(1 + z) + \Omega_{r,0}(1 + z)^2 + \Omega_{\Lambda,0}(1 + z)^{-2} + \Omega_{k,0}} \tag{36}$$

5 Show that a flat universe containing any amount of $\Omega_{m,0}$, but with $\Omega_{\Lambda,0} < 0$, will reach what is known as a Big Crunch, a moment when $a$ vanishes. Calculate the time to the end.

Notes


1. What Time Is It over There? is an interesting film by Ming-liang Tsai.
2. Note that if we define the Hubble length by $L = H^{-1}$ and regard it as some kind of radius of the universe, we can massage the definition of the critical density $\rho_c = \frac{3H^2}{8\pi G}$ into the following: $2G(4\pi L^3/3)\rho_c = L$. If we think, rather loosely, of the universe as a Euclidean ball of radius $L$ and mass density $\rho_c$, and hence mass $M = (4\pi L^3/3)\rho_c$, this tells us that the Schwarzschild radius of the universe is equal to its radius, that the universe is on the verge of being a black hole. This is of course just a heuristic way of interpreting what the critical density means.
3. For those who find this functional form indigestible, I offer (and prefer) the alternate form $\Omega_{\Lambda,0} = \frac{1}{2}\Omega_{m,0}(\kappa + 3\kappa^{1/3} + 3\kappa^{-1/3} + \kappa^{-1})$, with $\kappa = (1 - \Omega_{m,0} + \sqrt{1 - 2\Omega_{m,0}})/\Omega_{m,0}$.
4. Again, as in the preceding endnote, I offer the alternate form $\Omega_{\Lambda,0} = \frac{i}{2}\Omega_{m,0}(w - 3w^{1/3} + 3w^{-1/3} - w^{-1})$, with $w = (\sqrt{2\Omega_{m,0} - 1} + i(\Omega_{m,0} - 1))/\Omega_{m,0}$.
of the Early Universe


A physical history of the universe 


At one time, textbooks on Einstein gravity devoted¹ a substantial fraction of their expositions to physical cosmology. In the intervening decades, cosmology has grown by leaps and bounds, and any serious discussion of the subject requires a textbook² of its own, as I’ve said in chapter VIII.1. Here I can only give you a sketch of the physical history of the universe, largely in qualitative³ terms. Thus, this chapter will consist of mostly talk.

But as a subscriber to Feynman’s aversion to all talk and no action, I feel compelled to 
do one small calculation toward the end of this chapter. We will determine the position of 
the first acoustic peak in the fluctuation of the cosmic microwave background. 


The once-hot universe 


Imagine that someone had filmed the universe’s evolution. In what follows, we will play 
the movie backward and forward, rewind and fast forward. Let us start by playing the movie 
backward, starting from the present. 

The crucial feature of our universe is that it expands and hence cools. As was derived 
in chapter VIII.1, the temperature of the radiation filling the universe varies with the 
cosmological scale factor a(t) like T ~ 1/a, so that, as we go back into the early universe, 
$T$ steadily increases. Yesterday was hotter than today by about $10^{-11}$%, not by much, but
a billion years ago, the universe was hotter by 7%. And so it goes. As Boltzmann and
others taught us, temperature measures the average energy of the particles in a thermal
distribution. Keep in mind that 1 eV $\simeq 1.16 \times 10^4$ K, and so the photons in the cosmic
microwave background, nowadays at 2.7 K, are almost negligibly feeble on the scale of 
atomic physics. But as we go back in time, these now frail photons once rampaged, ripping 
atoms apart. 

Figure 1 In the primeval universe, photons rushed about vigorously trying to prevent 
the electrons from attaching themselves to the protons. A pictorial representation. 
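
The two numbers quoted above for how much hotter the universe used to be follow from T ~ 1/a: over a short lookback time Δt, the fractional change in temperature is roughly H₀Δt. The sketch below is only a rough check. It assumes a Hubble rate of 70 km/s/Mpc, a value not given in this chapter, and holds it constant over the interval, which is crude for a billion years.

```python
# Rough check of "hotter by about 1e-11 % yesterday, 7% a billion years ago".
# Assumption (not from the text): H0 = 70 km/s/Mpc, held constant over the interval.

KM_PER_MPC = 3.0857e19          # kilometers in one megaparsec
SECONDS_PER_YEAR = 3.156e7      # approximate length of a year in seconds
K_BOLTZMANN_EV = 8.617e-5       # Boltzmann constant in eV per kelvin

H0_PER_YEAR = 70.0 / KM_PER_MPC * SECONDS_PER_YEAR   # Hubble rate in 1/year


def fractional_heating(lookback_years):
    """Approximate (T_then - T_now)/T_now for T ~ 1/a and small H0 * dt."""
    return H0_PER_YEAR * lookback_years


percent_yesterday = 100 * fractional_heating(1 / 365.25)
percent_gyr_ago = 100 * fractional_heating(1e9)
cmb_energy_ev = K_BOLTZMANN_EV * 2.7    # k_B T for today's 2.7 K photons

print(f"yesterday: {percent_yesterday:.1e} %")
print(f"1 Gyr ago: {percent_gyr_ago:.1f} %")
print(f"k_B T today: {cmb_energy_ev:.1e} eV")
```

The last line makes the point about feeble photons concrete: the characteristic photon energy today is a few times 10⁻⁴ eV, far below atomic energy scales.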


Next, play the movie forward, starting at a time when the temperature was way too hot 
for atoms to exist. You might recall that the binding energy of the hydrogen atom is 13.6 eV. 
So let us suppose that the typical energy of the photons far exceeds that. The universe was 
a hot soup of electrons, protons, and so forth. (As we will see later, there were a small 
number of deuterons, helium nuclei, and the like around, but to keep the story simple, we 
will ignore them.) 

The photons are constantly scattering off the charged particles: the photons and the 
charged particles are said to be tightly coupled to each other. Occasionally, a proton and 
an electron would come together and form a hydrogen atom, but almost immediately, a 
photon would come along and knock the proton and the electron away from each other. 
See figure 1. 


Recombination delayed and decoupling 


But the universe keeps on cooling, and the average energy of the photons keeps on 
dropping. Eventually, the photons become too weak to ionize the hydrogen atom, at which 
point, protons and electrons rush to pair off with each other. 

This milestone in the life of the universe, the combination of protons and electrons into 
hydrogen atoms, is known as recombination.* You might think that the recombination 
would occur as soon as the temperature drops below ~13.6 eV, but two physical effects 
delay recombination. When we specify the temperature of a photon gas, we are talking 
about the average energy of the photons. The photons in the tail end of the thermal dis- 
tribution are much more energetic (by definition of “tail end”). Furthermore, the universe 
contains vastly more photons than protons and electrons, by a factor of about $10^{10}$. Thus, 
through the sheer numbers of photons, even if only a tiny fraction of them have ener- 
gies in excess of 13.6 eV, the hydrogen atoms are still ripped apart as soon as they form. 
Consequently, recombination does not occur until the universe has cooled to about 0.3 eV. 
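
The counting argument in this paragraph can be made semi-quantitative. The sketch below computes the fraction of photons in a blackbody distribution with energy above 13.6 eV and multiplies it by the rough figure of 10¹⁰ photons per baryon. It is only a crude count under those assumptions, not the actual calculation of recombination, which requires following the ionization equilibrium in detail; the function names are made up for illustration.

```python
import math

ZETA3 = 1.2020569031595942       # Riemann zeta(3), normalizes the Planck number density
PHOTONS_PER_BARYON = 1e10        # rough ratio quoted in the text
BINDING_ENERGY_EV = 13.6         # hydrogen binding energy


def fraction_above(x, terms=2000):
    """Fraction of blackbody photons with E > x * k_B T.

    Uses the exact series for the integral of u^2 / (e^u - 1) from x to infinity:
    sum over n >= 1 of exp(-n x) * (x^2/n + 2x/n^2 + 2/n^3), divided by 2 zeta(3).
    """
    total = 0.0
    for n in range(1, terms + 1):
        total += math.exp(-n * x) * (x * x / n + 2 * x / n**2 + 2 / n**3)
    return total / (2 * ZETA3)


def ionizing_photons_per_baryon(temperature_ev):
    """Crude count of photons above 13.6 eV available per baryon."""
    x = BINDING_ENERGY_EV / temperature_ev
    return PHOTONS_PER_BARYON * fraction_above(x)


for temperature_ev in (13.6, 1.0, 0.5, 0.3):
    count = ionizing_photons_per_baryon(temperature_ev)
    print(f"T = {temperature_ev:5.1f} eV: {count:.1e} ionizing photons per baryon")
```

In this count the number of ionizing photons per baryon stays far above one at T = 1 eV and drops below one only somewhere between 0.5 eV and 0.3 eV, in line with the delayed recombination described above.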


* Recombination surely ranks as one of the least appropriate physics terms: the protons and the electrons had 
never been combined earlier. 

Since the photon interacts oppositely with the positively charged protons and the 
negatively charged electrons in an atom, its interaction with an atom is vastly reduced compared 
to its interaction with the proton or with the electron. (The net interaction with the atom 
does not add up to exactly zero, because the electrons are spread out, while the protons are 
concentrated in the nucleus.) 

After recombination, the mean free path of the photons increases dramatically. There are 
hardly any charged particles for the photons to scatter off, and the interaction of photons 
with the atoms, as explained above, is relatively weak. Soon, the mean free path of the 
photons exceeds the Hubble radius of the universe, and the photons are effectively free. 
They are said to decouple.* 


Pale fire 


Decoupling was a crucial event in the evolution of the universe, as was pointed out in 1948 
by George Gamow, Ralph Alpher, and Robert Herman. They realized that a pale shadow 
of the fire that filled the early universe should still be visible. 

After decoupling, radiation (photons) and matter (atoms) more or less go their separate 
ways. These primeval photons interact so little with matter that through the eons, they 
merely drift through the universe, getting redshifted down to a cosmic microwave back- 
ground permeating the universe. Gamow and his collaborators proposed the detection of 
these “relic photons” as a crucial test of the Big Bang. Finally, in 1965, Arno Penzias and 
Robert Wilson of Bell Telephone Company fortuitously detected these telltale photons. 

That the actual detection of these primeval photons was not made until 17 years after 
the initial prediction, and then only by chance, poses something of a puzzle for historians 
of physics. The technology was available in 1948. Why then had experimenters not tripped 
over one another to look for the glow from the Big Bang?⁴ 


Primeval nucleosynthesis 


As we go farther back in time, the universe gets even hotter. When the average energy gets 
to about one tenth of an MeV, nuclei are ripped apart. (Again, when the average energy 
of the particles is only one tenth of an MeV, quite a few photons already have energies in 
excess of a few MeV.) The universe was too hot for nuclei to exist and consisted of a hot 
cauldron of protons, neutrons, and electrons. 

Now run the movie forward. The universe cools. Soon photons no longer have enough 
energy to break up a deuteron, so that the deuteron, once formed, could stick around. 
The number of deuterons increases rapidly. Violence fades as the universe ages. As the 
deuterons drift around, some of them are hit by protons and neutrons. When a proton 
sticks to a deuteron, a helium 3 nucleus results. When a neutron sticks to a deuteron, a 


* Recombination and decoupling happen to occur at roughly the same time in our universe, but the events 
are conceptually distinct. 




tritium nucleus results. And so on and so forth. The net result is that as the universe cools, 
protons and neutrons stick to one another. The primeval soup of protons and neutrons is 
converted into nuclei of various kinds, laying the foundation for the modern world. The 
construction of atomic nuclei in the early universe, known as primeval nucleosynthesis, 
is easy to understand qualitatively. 


The Gamow principle 


In 1948, the U.S. government declassified nuclear reaction rates—information about how 
readily a proton or a neutron would stick to a given nucleus to form a larger nucleus. George 
Gamow realized that with this information, he could calculate the relative abundance of 
the elements in the universe. 

I refer to this insight as the Gamow principle: If you understand the physics at the energy 
scale E, then you can describe the evolution of the universe at temperature E. 

As we travel back to the early universe in our mind, we go through the standard 
curriculum of physics. After atomic physics comes nuclear physics. After nuclear physics 
comes particle physics. After the known particle physics comes grand unified physics, 
applicable when the universe was a soup of quarks, leptons, and grand unified gauge 
bosons. After that comes string theory. 

An obvious corollary follows: If you don’t understand the physics at a given energy scale, 
then you can only speculate. For example, if you think that you understand physics above 
the Planck scale $M_P \sim 10^{19}$ GeV, then you could describe trans-Planckian cosmology. But 
if you don’t, then all the talk amounts to mere speculation. 

Perhaps paradoxically to the layperson (but not to you), the later the epoch, the more 
detailed and involved the physics you would have to master to work out the cosmology of 
the epoch. For example, around 100 million years after the Big Bang, hydrogen molecules 
began to form, and it is essential to understand the difference between the excitation spec- 
trum of the hydrogen molecule versus that of hydrogen atoms. Much of what happened 
since then has to be worked out by massive numerical computation.⁵ 


Stellar nucleosynthesis 


The universe is a spiraling Big Band in a polka-dotted speakeasy, 
effectively generating new light every one-night stand. 


—Ishmael Reed 


Gamow originally thought that all nuclei were formed in primeval nucleosynthesis. It 
later became clear, however, that nucleosynthesis essentially came to a halt shortly after 
helium was formed. By that time, the expansion of the universe had reduced drastically 
the numbers of protons, deuterons, and helium nuclei per unit volume. The collisions 
between them were so infrequent that nuclear processes by and large came to a halt. 

But as the universe cools further, the electrons move ever more slowly. With the passing 
of time, the universe becomes cool enough for the products of nucleosynthesis—the 
protons, deuterons, and helium nuclei—to grab some passing electrons to form atoms. 
After atoms form, the nuclei can no longer get close to one another. When two atoms get 
close, the buzzing clouds of electrons keep the two nuclei far apart. 

After primordial nucleosynthesis, the universe thus settles into a relatively dull exis- 
tence, permeated by enormous clouds of gas, cool enough for atoms and molecules to 
exist. However, gravity has already been hard at work, pulling together neighboring globs 
of gas. Soon the first stars condense out of the primeval gas clouds. As the gas atoms rush 
together to form a star, they crash into one another with such abandon that they rip elec- 
trons off one another, thus allowing the nuclei to approach one another once again and 
restart nuclear reactions. The universe is suddenly lit with lights beyond measure. 

Inside the stars, a helium nucleus bumps into another helium nucleus, which stick to 
each other to form a beryllium nucleus. Yet another helium nucleus wanders by, sticks 
to the beryllium nucleus, and produces a carbon nucleus. Out of starfires we humans 
become a possibility. Note the crucial difference between primeval nucleosynthesis and 
stellar nucleosynthesis. In the primeval setting, nuclei were drifting farther and farther 
apart in the expanding universe. But when they were confined inside stars, they were 
bound to bump into one another. Thus deuterium, helium, and a little bit of lithium 
were produced in the primeval universe, while the more massive nuclei were formed 
later in stars. When some of the first-generation stars exploded, they ejected into space 
these higher nuclei, among other things. Out of this ejected debris, a second generation 
of stars soon condensed. These stars started out containing heavier nuclei like carbon, out 
of which more and more complicated nuclei are manufactured. Eventually, these stars in 
turn exploded and splattered themselves over the cosmos. 

You can’t make much out of only hydrogen and helium, but with carbon, silicon, iron, 
and so forth, the possibilities become endless. You can make rocks, for instance. Bits of 
rocks come together to form rock piles, laughably minute specks of dust in the cosmic 
scheme of things. On one of these specks, carbon atoms started connecting up with 
hydrogen, oxygen, and so forth. Somehow, these bunches of atoms suddenly came alive. 
Eons and lots of self-improvement courses later, this moving, eating, reproducing ooze 
turned into what are known as human beings, who eventually end up writing and reading 
textbooks on Einstein gravity. 


The rich get richer 


Gravity plays an all-important role in the formation of structures in the universe. 

Let me first tell the story without dark matter. 

About 40,000 years after the Big Bang, bits of matter started to come together, forming 
enormous structures that eventually condensed into galaxies. Galaxy formation marked the 
first step in the emergence of structures in our universe: Within the galaxies, protostars 
soon formed. 

Long ago, Newton had already identified the basic physics responsible for the emergence 
of structure: the inherent instability of gravity. In 1692, a certain Reverend Dr. Richard 
Bentley wrote to Newton, arguing that the universal presence of gravity proved the exis- 
tence of God, a view with which Newton was much in sympathy. A lively correspondence 
followed. In one of his letters to Bentley, Newton suggested how structures could emerge 
in the universe, as follows. 

Imagine space filled uniformly with matter. Newton made the point that any irregularity, 
no matter how minute, would grow larger. Consider a region with more matter per unit 
volume than the surrounding regions. Being denser, the matter in this region would pull 
matter in from the surrounding regions by the force of gravity. As a result, the matter 
distribution in this region becomes even denser. The process accelerates—it is the cosmic 
equivalent of the often-observed phenomenon that the rich get richer and the poor get 
poorer.* “And thus might the sun and fixed stars be formed,” concluded Newton. (Galaxies 
were unknown in Newton’s time.) 

Newton’s scenario⁶ makes such obvious sense that it remains the basic explanation of 
how structures emerged in the early universe. Small fluctuations in the density of matter 
grew and became amplified. 

In its intrinsic instability, gravity is dramatically different from the other forces. The 
electromagnetic force, the other long-ranged force in Nature, is intrinsically stable, because 
it acts oppositely on positive and negative charges. To see this stability, consider a gas 
consisting of equal numbers of protons and electrons. An excessive concentration of 
electrons in one region would be immediately smoothed out by the mutual repulsion 
among the electrons. Unlike gravity, the electromagnetic force tends to smooth matter 
out, counteracting gravity’s tendency to clump matter together. 


Dissipative collapse 


In 1902, the English physicist Sir James Jeans tried to calculate the size of the actual lumps 
that would form. However, he did not know about the universe’s expansion. Clearly, cosmic 
expansion, by thinning out the distribution of matter, works to slow down the formation 
of lumps. In our analogy, cosmic expansion acts like taxation: as the rich get richer, part of 
their wealth is continuously removed. But the tax rate is more or less flat: regions sparse 
with matter are stretched out at essentially the same rate as those regions dense with 
matter. A calculation⁷ including the effect of cosmic expansion, first done by the Soviet 
physicist E. Lifschitz in 1946, shows that lumps will still form but at a far slower rate than 
would have been the case were the universe static. With taxation, the rich continue to get 
richer; they are merely slowed down. 


* Introduced into sociology as the Matthew principle by R. K. Merton. We already cited this principle in 
chapters III.2 and VI.3. 

As lumps of matter form, more atoms rush in toward the nearest lump. As they rush in, 
they collide with one another, emitting photons that, since they hardly interact with atoms, 
escape from the mad rush, thus carrying away energy. In this way, the atoms lose energy 
and move ever more slowly, less and less able to resist the inward pull of gravity. And 
thus matter collapses into increasingly compact lumps, in a process known as dissipative 
collapse.* That atoms can radiate photons and dissipate energy is essential to the story. 
Were there no mechanism for dissipation, the kinetic energy of the atoms would prevent 
them from collapsing into lumps. They would simply bounce off one another. 


Without gravity, we would not be 


Thanks, gravity. How wonderful gravity is! Without it, we would not be. The universe 
might have been a thin haze without much to recommend it. But gravity couldn't have 
done it alone. To me, it is awe inspiring how only the intricate intertwining of all four 
forces manages to bring it off. As gravity strives to bring structures out of the haze, the 
electromagnetic force is needed to carry the excess energy away. Once the particles quiet 
down and gravity brings the primeval hydrogen and helium nuclei face to face, the strong 
and the weak forces step in. The strong force causes nuclei to react with one another, thus 
igniting the nuclear fire that brings warmth to the vast void. The weak force is crucial lest 
stars become as uncivilized as nuclear bombs. Certain nuclear reactions can only proceed 
through the weak force. Because the weak force is, well, weak, these reactions proceed 
extremely slowly. As a result, the nuclear fires in stars burn at a stately pace. Meanwhile, 
gravity is busily collecting the ejecta left by the dying stars of the first stellar generation 
into planet-sized bits of interstellar dirt. The electromagnetic force is keeping busy, too. 
It is transporting energy from the stars to warm these bits of dirt, and it is running all 
kinds of chemical reactions, bonding one atom to another, so more and more interesting 
structures can be built. It’s a team effort. 


The problem of not enough time 


The rich get richer, but there still is a problem that they often feel acutely: it takes time 
to become rich. Similarly, it takes time for the density fluctuations to grow to the point 
when structures can form. As explained above, we know when the fluctuations can start 
growing (not until recombination) and how fast the fluctuations grow once they get going. 
But in addition, we also know how large the initial density fluctuations were, thanks to 
the cosmic microwave background. The photons that compose the cosmic microwave 
background, having traveled unperturbed since the time of decoupling, tell us about the 
matter distribution at the time. 


* For an illustration of dissipative collapse, see figure 8.3 on p. 129 of Toy/Universe. 

Picture the primeval fluid sloshing back and forth in the universe. Photons leaving a 
denser, and hence hotter, region would be more energetic, and photons leaving a less 
dense region would be cooler. Thus, any density fluctuation in the universe at the time of 
decoupling would be imprinted as a temperature fluctuation in the observed microwave 
background. Indeed, at one time, this line of reasoning was used to predict the temperature 
fluctuation in the microwave background. 

We are going to discuss one aspect of this temperature fluctuation in some detail later, 
but we might as well introduce the notation now. Let $T(\hat n) = T(\theta, \varphi)$ be the observed 
temperature of the microwave background in the direction $\hat n$ defined by the angles $\theta$ 
and $\varphi$ in spherical coordinates. As mentioned earlier, the average temperature 
$\langle T \rangle = \frac{1}{4\pi} \int d\theta\, d\varphi\, \sin\theta\, T(\theta, \varphi)$ was measured to be ~2.725 K. Consider the fractional deviation 
from the mean $\frac{\delta T}{T}(\hat n) = (T(\hat n) - \langle T \rangle)/\langle T \rangle$. One measure of the fluctuation is given by the 
root-mean-square of the fractional deviation $\sqrt{\langle (\delta T/T)^2 \rangle}$, which was observed to be ~$10^{-5}$. This 
observed value turned out to be significantly less than the predicted value, thus indicating 
that the universe at decoupling was smoother than previously thought. But if the density 
fluctuation at recombination started small, there was not enough time for it to grow into the 
lumpy universe we know today. At one time, this posed a serious problem for cosmology, 
a problem now resolved, as we will see presently. 
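
To get a feeling for the size of the problem, one can use the standard linear-theory result, not derived in this chapter, that in a matter dominated universe a small density contrast grows in proportion to the scale factor a. The sketch below is an order-of-magnitude illustration only. It assumes that the density contrast of ordinary matter at decoupling is comparable to the observed temperature fluctuation of about 10⁻⁵, and it takes decoupling to be at a redshift of about 1,100.

```python
# Rough illustration of the "not enough time" problem.
# Assumptions (not derived in the text): linear growth delta ~ a during matter
# domination, and an initial density contrast comparable to deltaT/T ~ 1e-5.

Z_DEC = 1100                 # approximate redshift of decoupling
INITIAL_CONTRAST = 1e-5      # order of magnitude of the fluctuation at decoupling


def growth_factor(z_start, z_end=0.0):
    """Factor by which a linear perturbation grows between two redshifts (delta ~ a)."""
    return (1 + z_start) / (1 + z_end)


contrast_today = INITIAL_CONTRAST * growth_factor(Z_DEC)
print(f"growth factor since decoupling: {growth_factor(Z_DEC):.0f}")
print(f"density contrast today: {contrast_today:.3f}")
```

A contrast of order one is needed before a region can collapse into a bound structure, so growth by a factor of about a thousand from 10⁻⁵ falls short by roughly two orders of magnitude.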


Dark matter 


Thus far, I have told the story of the early universe without dark matter. As was mentioned 
earlier on various occasions, observational evidence indicates that the universe contains a 
lot of dark matter particles, sometimes known as wimps.* For example, stellar movements 
indicate that the dark matter in various galaxies outweighs the total collection of stars in 
those galaxies. This astonishing conclusion completely revises our picture of galaxies. The 
luminous matter we know and love, consisting of nucleons and electrons, is now seen as 
bits of flotsam bobbing about in a sea of dark matter. 

Dark matter now comes to the rescue of cosmologists perplexed by the “not enough 
time” problem. By definition, dark matter does not interact electromagnetically. Since the 
wimps do not interact with photons, unlike the charged protons and electrons, they can 
start coming together gravitationally long before decoupling. Regions that happen to have 
a somewhat denser distribution of wimps could start getting even denser by pulling in 
wimps from the surrounding regions through gravity. 

Meanwhile, the photons ignore the wimps as they struggle mightily to smooth out the 
distribution of ordinary matter. All that labor would prove to be in vain. The wimps are 
already condensing into lumps of dark matter, which tug at the ambient ordinary matter 
through gravity, urging the protons and electrons to fall in. As soon as decoupling occurred, 
atoms formed, and ordinary matter, now neutral, fell into the ready-made dark matter 


* An acronym for “weakly interacting massive particles.” It should be mentioned that as of this writing, the 
actual particle (or particles) that dark matter consists of has not been directly observed and identified. 

lumps. With dark matter present, the formation of structures in the universe could start 
earlier, without waiting for the formation of atoms. 
Not only do the rich get richer, but the rich also can count on inheriting from the wimps. 


First acoustic peak 


After all this talk, let us do an actual calculation, as promised. The correlation function 
defined by⁸ 

$$C(\theta) = \left\langle \frac{\delta T}{T}(\hat n_1)\, \frac{\delta T}{T}(\hat n_2) \right\rangle_{\cos\theta = \hat n_1 \cdot \hat n_2} \qquad (1)$$

evidently measures how the fractional deviations in temperature in two regions of the 
sky separated by angle $\theta$ are correlated. We can expand this angular function in terms of 
Legendre polynomials: $C(\theta) = \sum_{l=0}^{\infty} \frac{2l+1}{4\pi} C_l P_l(\cos\theta)$. 

The observational data, with $C_l$ plotted against $l$ (which you can think of roughly as the 
conjugate of $\theta$), is shown in figure 2. Earlier you were asked to picture the primeval fluid 
sloshing back and forth; hence the peaks in the plot are known as acoustic peaks. Note the 
position of the first peak at $l_1 \simeq 180$, corresponding to the angle $\theta_1 = \pi/l_1$. 

Since the smaller values of $l$ correspond to larger values of $\theta$, the value of $\theta_1$ tells us about 
the maximal angular size of the primeval density fluctuations. Let us use this observation 
to estimate the position of the first acoustic peak. We will do it for a flat universe for 
simplicity, and come back later to the question of how the curvature of the universe affects 
the position. 
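
For readers who like to see the Legendre expansion at work, here is a small sketch that sums the series for a given list of coefficients. It assumes the conventional normalization with a factor (2l + 1)/4π in front of each term and the rough correspondence θ ≈ π/l between angle and multipole. The toy spectrum is made up purely to test the code and has nothing to do with the real data.

```python
import math


def legendre(l_max, x):
    """Return [P_0(x), ..., P_l_max(x)] using Bonnet's recursion."""
    values = [1.0, x]
    for l in range(1, l_max):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: l_max + 1]


def correlation(theta, c_l):
    """C(theta) = sum over l of (2l + 1)/(4 pi) * C_l * P_l(cos theta)."""
    p = legendre(len(c_l) - 1, math.cos(theta))
    return sum((2 * l + 1) / (4 * math.pi) * c * p[l] for l, c in enumerate(c_l))


# Toy spectrum (made up for illustration): only the l = 2 multipole is present,
# normalized so that C(theta) reduces to P_2(cos theta).
toy_c_l = [0.0, 0.0, 4 * math.pi / 5]
theta = math.radians(30)
print(correlation(theta, toy_c_l))

# Angular scale associated with a multipole l, using theta ~ pi / l.
l_first_peak = 180
theta_first_peak_deg = math.degrees(math.pi / l_first_peak)
print(theta_first_peak_deg)   # about 1 degree
```

The second print shows why the first peak is often described as a feature at an angular scale of about one degree.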

Back in school, we learned that the angular size of an object is given by $\theta = \lambda/d$, where 
$\lambda$ denotes the linear size of the object, that is, the distance between its two ends, and $d$ the 
distance from us to the object along the line of sight. Thus, to determine $\theta_1$ and hence $l_1$, 
our task reduces to calculating two distances. Since we are calculating a ratio $\lambda/d$, we can 
afford to be sloppy and drop common overall factors. 

Figure 2 Schematic representation of the observational data on angular correlation in the 
fluctuations of the cosmic microwave background; see the text for details. (Vertical axis: 
$l(l+1)C_l$. Horizontal axis: $l$, on a logarithmic scale from 10 to 1,000.) 

As explained in detail in chapter V.3, distance should be defined operationally by bounc- 
ing light back and forth off the object of interest, as in radar ranging. For flat (k = 0) 
spacetime, the relevant integral for the coordinate distance between two events 1 and 2 is 
given by 

$$d = R_{12} = \int_{r_1}^{r_2} dr = c \int_{t_1}^{t_2} \frac{dt}{a(t)} = c \int_{a_1}^{a_2} \frac{da}{a \dot a} \equiv c J_{12} \qquad (2)$$


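(see (V.3.10), with the notation used there). Note that we have restored the speed of light. In 
chapter VIII.1, we learned that $a(t) \propto t^{2/3}$ during the matter dominated era, and $a(t) \propto t^{1/2}$ 
during the radiation dominated era. Write $a \propto t^{\gamma}$, and hence $\dot a \propto t^{\gamma - 1} \propto a^{1 - 1/\gamma}$. The quantity 
$J_{12}$ defined in (2) for convenience thus turns out to be 

$$J_{12} \propto \int_{a_1}^{a_2} \frac{da}{a^{2 - 1/\gamma}} \propto \left( a_2^{\frac{1}{\gamma} - 1} - a_1^{\frac{1}{\gamma} - 1} \right) \qquad (3)$$

During the matter dominated era, $J_{12} \propto (a_2^{1/2} - a_1^{1/2})$, and during the radiation dominated 
era, $J_{12} \propto (a_2 - a_1)$. 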
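
The power-law result (3) is easy to check numerically. The sketch below integrates da/(a ȧ) with a simple midpoint rule for a scale factor a = t^γ and compares the answer with the closed form. The exponent is called `gamma` in the code, and the overall constant that the text drops is kept here so that the two numbers can be compared directly. The limits of integration are arbitrary illustrative values.

```python
def j12_numeric(gamma, a1, a2, steps=100_000):
    """Midpoint-rule estimate of the integral of da / (a * adot) for a = t**gamma.

    With a = t**gamma, adot = gamma * t**(gamma - 1) = gamma * a**(1 - 1/gamma).
    """
    width = (a2 - a1) / steps
    total = 0.0
    for i in range(steps):
        a = a1 + (i + 0.5) * width
        adot = gamma * a ** (1 - 1 / gamma)
        total += width / (a * adot)
    return total


def j12_closed_form(gamma, a1, a2):
    """Closed form: (a2**p - a1**p) / (gamma * p), with p = 1/gamma - 1."""
    p = 1 / gamma - 1
    return (a2**p - a1**p) / (gamma * p)


# Matter era: gamma = 2/3, so J12 is proportional to a2**(1/2) - a1**(1/2).
print(j12_numeric(2 / 3, 1e-3, 1.0), j12_closed_form(2 / 3, 1e-3, 1.0))
# Radiation era: gamma = 1/2, so J12 is proportional to a2 - a1.
print(j12_numeric(1 / 2, 1e-6, 1e-4), j12_closed_form(1 / 2, 1e-6, 1e-4))
```

In each line the two printed numbers should agree closely, confirming the exponents 1/2 and 1 quoted for the two eras.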

Recall from chapter VIII.2 that the scale factor $a$ is related to the redshift $z$ by $a = 1/(1 + z)$. 
The relevant numbers we need are $z_{\rm mdom} \simeq 8{,}800$ when matter started dominating 
over radiation, $z_{\rm dec} \simeq 1{,}100$ when radiation decoupled from matter, and of course 
$z_0 = 0$ at present. Or equivalently $a_0 = 1$, $a_{\rm dec} \sim 10^{-3}$, and $a_{\rm mdom} \sim 10^{-4}$. Since $a_0 \gg a_{\rm dec} > a_{\rm mdom}$, 
the evaluations of $d$ and $\lambda$ simplify, as we will see presently. 

Now that I have set things up and told you the relevant numbers, we are ready to 
calculate. First, let us calculate the coordinate distance d between decoupling and the 
present. Since dark energy is only starting to take over now, it is an excellent approximation 
in calculating $d$ to assume a matter dominated universe in the eons between decoupling 
and now. Thus, we have $d \propto (a_0^{1/2} - a_{\rm dec}^{1/2}) \simeq a_0^{1/2}$. 

The calculation of $\lambda_s$ is slightly more involved. The maximum size of the density 
fluctuation is limited by the distance that sound could have traveled since the Big Bang, 
and hence by the speed of sound $c_s$. 

Normally, when we think of a sound wave, it can have any wavelength we like, but for 
matter to oscillate together as a density wave, information has to be conveyed from one end 
of the fluctuation to the other. In ordinary circumstances, compared to the characteristic 
time scale of oscillation, a sound wave can be taken to have existed forever; that is, it can 
be treated as a standing wave. In early cosmology, we have the extraordinary situation that 
time itself had started only a little while earlier. Sound, or more accurately a density wave, 
could not have gotten farther than a certain maximum distance* determined by the time 
elapsed since the Big Bang. 


* What we are discussing here is known as the sound horizon. The concept of a light horizon will be discussed 
in the next chapter. 

Hence, the maximum size of the fluctuations in the primeval fluid when the cosmic 
microwave photons decoupled is given by $\lambda_s = c_s \int_0^{t_{\rm dec}} \frac{dt}{a(t)} = c_s \int_0^{a_{\rm dec}} \frac{da}{a \dot a}$. The crucial point 
is that the integral here is multiplied by $c_s$, rather than $c$, as in the calculation of the 
distance $d$. 

This integral for $\lambda_s$ is naturally divided into two pieces according to the behavior of $a$: 
(a) from the Big Bang until matter dominance (call it BB to MD), that is, from $t = 0$ until 
$t_{\rm mdom}$ and (b) from matter dominance to decoupling (call it MD to DEC). Using our results 
for $J_{12}$ during the matter dominated era and during the radiation dominated era, and the 
numerical values of $a$ given earlier, we see that the piece (MD to DEC) contributes much 
more to the integral than does the piece (BB to MD), since 
$a_{\rm dec}^{1/2} - a_{\rm mdom}^{1/2} \sim a_{\rm dec}^{1/2} > a_{\rm mdom}^{1/2} \gg a_{\rm mdom}$. 


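Putting it all together, and using $c_s = c/\sqrt{3}$ from chapter III.6, we obtain 

$$\theta_1 = \frac{\lambda_s}{d} \simeq \frac{a_{\rm dec}^{1/2}}{\sqrt{3}} = \frac{1}{\sqrt{3(1 + z_{\rm dec})}} \qquad (4)$$

In other words, $l_1 = \pi/\theta_1 \simeq \pi \sqrt{3(1 + z_{\rm dec})} \simeq 180$, in excellent agreement with the 
observational data shown in figure 2. 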
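
The arithmetic behind this estimate takes only a few lines. The sketch below evaluates the leading approximation, with the sound horizon proportional to the square root of the scale factor at decoupling divided by √3 and the distance to us set by the scale factor today, using the redshift of decoupling of about 1,100 quoted in the text. It also prints the size of the term neglected in the distance. Nothing here goes beyond the sloppy treatment of the text, and the variable names are mine.

```python
import math

Z_DEC = 1100      # redshift at decoupling, as quoted in the text

a_dec = 1 / (1 + Z_DEC)

# Leading estimate: d ~ a_0**(1/2) = 1 and lambda_s ~ a_dec**(1/2) / sqrt(3).
theta_1 = math.sqrt(a_dec / 3)
l_1 = math.pi / theta_1
print(f"theta_1 = {math.degrees(theta_1):.2f} degrees, l_1 = {l_1:.0f}")

# Size of the term dropped in d: a_dec**(1/2) compared with a_0**(1/2) = 1.
print(f"a_dec**0.5 = {math.sqrt(a_dec):.3f}")
```

The neglected term is a correction of a few percent, which is well within the accuracy of an estimate that drops the contribution of the radiation dominated era altogether.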


Effect of curvature on fluctuations in the cosmic microwave background 


We did the calculation for a flat universe, but we could easily take into account the effect 
of spatial curvature. (Recall exercise 6 in chapter V.3.) Since curvature hardly plays a 
role in the very early universe, its effect is mainly in the calculation of $d$. Instead of 
$d = R = \int_0^R dr = c \int \frac{da}{a \dot a} = cJ$ as in (2), we have for the closed universe 

$$\int_0^R \frac{dr}{\sqrt{1 - \frac{r^2}{L^2}}} = cJ$$

so that $d = R = L \sin(cJ/L) < cJ$. Thus, $d$ is smaller, so that $\theta_1$ is larger. Hence in a 
closed universe, $l_1$ is smaller than what it would be in a flat universe. 

This effect is also easy to understand pictorially. Think of yourself at the north pole 
looking at a stick aligned east-west at some latitude. Picture the two geodesics reaching 
you starting at the two ends of the stick. The angle between the two geodesics is larger 
than it would be were the earth flat. See figure 3. 


Figure 3 The effect of curvature on the 
position of the first acoustic peak. 

For the open universe, replace the integral $\int_0^R \frac{dr}{\sqrt{1 - r^2/L^2}}$ by $\int_0^R \frac{dr}{\sqrt{1 + r^2/L^2}}$, 
or effectively sine in the preceding paragraph by hyperbolic sine. Hence in an open 
universe, $l_1$ is larger than what it would be in a flat universe. 

We conclude that 

$$l_1(\text{open}) > l_1(\text{flat}) > l_1(\text{closed}) \qquad (5)$$


The first acoustic peak is shifted to a smaller value of $l$ for a closed universe, and to a larger 
value of $l$ for an open universe. The observational data on the position of the first acoustic 
peak favor a flat universe. 
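
The ordering of the three cases can be illustrated with a few lines of code. The sketch below uses the relations d = L sin(cJ/L) for the closed universe, d = cJ for the flat universe, and d = L sinh(cJ/L) for the open universe, and it holds the sound horizon fixed, as the text does. The numerical values, a curvature radius equal to twice cJ and a sound horizon tuned so that the flat case gives 180, are made up purely for illustration.

```python
import math


def transverse_distance(cj, curvature_radius, geometry):
    """Coordinate distance R for a given value of cJ.

    closed: R = L sin(cJ / L); open: R = L sinh(cJ / L); flat: R = cJ.
    """
    x = cj / curvature_radius
    if geometry == "closed":
        return curvature_radius * math.sin(x)
    if geometry == "open":
        return curvature_radius * math.sinh(x)
    return cj


def first_peak(lambda_s, cj, curvature_radius, geometry):
    """l_1 = pi / theta_1 with theta_1 = lambda_s / d."""
    d = transverse_distance(cj, curvature_radius, geometry)
    return math.pi * d / lambda_s


# Illustrative numbers only: units with cJ = 1, a curvature radius L = 2 cJ,
# and lambda_s chosen so that the flat case gives l_1 = 180.
cj, curvature_radius = 1.0, 2.0
lambda_s = math.pi * cj / 180
for geometry in ("open", "flat", "closed"):
    print(geometry, round(first_peak(lambda_s, cj, curvature_radius, geometry)))
```

Even with a curvature radius only twice as large as cJ, the peak moves by just a few percent in either direction, so a precise measurement of its position is needed to constrain the curvature.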


Appendix: Baryogenesis and leptogenesis 


In this appendix, we wade into particle physics, and the discussion will get somewhat involved. It is fine if you 
choose to skip over this. 

Imagine playing the cosmic movie backward from the time when the universe consists of a soup of protons, 
neutrons, and electrons. Soon, the protons and neutrons in their turn are dissociated into quarks and gluons 
(the “cousins” of the photon, which are responsible for the strong interaction), and a knowledge of quantum 
chromodynamics, the theory of the strong interaction, is needed to work out the physical cosmology of this 
epoch, in accordance with the Gamow principle. 

At this point, you might well wonder where the quarks come from. While you are at it, how about the electrons? 

To answer these questions, we will have to venture into more speculative areas of particle physics. Since the 
requisite knowledge⁹ lies far outside the scope of this book, I will have to give an exceedingly brachylogous¹⁰ 
account meant only to give you an overview rather than understanding. 

Protons, neutrons, and their various cousins are called baryons and carry what is known as baryon number 
B. Similarly, electrons, electron neutrinos, and their various cousins are called leptons and carry what is known 
as lepton number L. After much travail, particle physicists now understand that each baryon is made out of three 
quarks. Since baryon number is additive, each quark carries baryon number $\frac{1}{3}$. 

For a long time, it was believed that baryon number B and lepton number L are separately conserved. We 
now even understand why. When an electron emits or absorbs a photon, it remains an electron. Similarly, when 
a quark emits or absorbs a photon, it remains a quark, in fact, exactly the same kind of quark. In other words, 
the photon does not change the charged particle it interacts with. In contrast, the W boson, the cousin of the 
photon that is responsible for the weak interaction, changes the electron into a neutrino, and a down quark into 
an up quark, and so on. But in these weak interaction processes, B and L are still separately conserved. In the late 
1960s, it was realized that the electromagnetic interaction and the weak interaction could be unified into a single 
electroweak interaction. Meanwhile, the gluon, which I already alluded to as being responsible for the strong 
interaction, changes an up quark of one color into an up quark of another color, and hence does not change the 
baryon number B. (I am assuming that you have heard somewhere that quarks carry a quantum number particle 
physicists call color, hence the name quantum chromodynamics for the theory of the strong interaction.) The 
gluon does not touch leptons at all. 

What I have done here is outrageous, brushing over entire books on particle physics in a couple of short 
paragraphs.¹¹ Lest the reader lose sight of what we are doing, let me recap. Very roughly speaking, we would like 
to understand why the universe contains this many electrons, this many protons, this many neutrons, and so 
on. But electron number and proton number are not conserved. As a specific example, the neutron decays into 
a proton, an electron, and an antineutrino of the electron type (written as $n \to p + e^- + \bar\nu_e$). The antineutrino 
is assigned lepton number —1 (so that the neutrino has lepton number +1). In this process, neutron number 
changes from 1 to 0, while proton number changes from 0 to 1. In contrast, we start out with B = 1, L = 0, and 
end with B = 1, L= 0. Thus, it is not particularly useful to talk about electron number, proton number, neutron 
number, and the like, since they are liable to change, but it does make good sense to count with baryon number 
B and lepton number L. 
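
The bookkeeping for neutron decay can be checked mechanically. The following is a minimal sketch in Python; the dictionary name, the particle labels, and the helper `totals` are illustrative assumptions, while the (B, L) assignments are the ones stated in the text.

```python
# (B, L) assignments as given in the text; labels are illustrative names only.
QUANTUM_NUMBERS = {
    "n": (1, 0),           # neutron: a baryon
    "p": (1, 0),           # proton: a baryon
    "e-": (0, 1),          # electron: a lepton
    "anti-nu_e": (0, -1),  # electron antineutrino: lepton number -1
}

def totals(particles):
    """Return the total (B, L) carried by a list of particle labels."""
    b = sum(QUANTUM_NUMBERS[name][0] for name in particles)
    l = sum(QUANTUM_NUMBERS[name][1] for name in particles)
    return b, l

# Neutron decay: n -> p + e- + anti-nu_e
initial = totals(["n"])
final = totals(["p", "e-", "anti-nu_e"])
print("before:", initial, "after:", final)
```

Neutron number and proton number each change in this process, but the two totals returned by `totals` agree before and after.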

At this point, the astute reader surely realizes that, unless the universe is finite in extent, it is more appropriate
to talk about baryon number density $n_B$ (namely the number of baryons per unit volume) and lepton number
density $n_L$, rather than the total number B and L. Furthermore, it is more convenient to talk about the
dimensionless ratios $n_B/n_\gamma$ and $n_L/n_\gamma$, since the photon number density $n_\gamma$ is easily determined by the well-measured temperature of the cosmic
microwave background.

Incidentally, $n_\gamma$ provides a good measure of the entropy of the universe. Recall that I have already quoted
the observed value $n_B/n_\gamma \sim 10^{-10}$; in other words, the matter we know and love is truly a pitifully small
contamination of an otherwise pristine universe. Equivalently, we could also say that the entropy per baryon
is huge.

So, to understand the matter content of the universe, we would like to be able to calculate these two 
cosmological quantities $n_B/n_\gamma$ and $n_L/n_\gamma$ from scratch. But starting with what?

Good question. The most appealing supposition is to imagine that right after the Big Bang, the universe is 
pristine, with B = 0 and L = 0. The observed baryons and leptons are somehow generated later. 

But, thus far in the story, B and L are separately conserved. Hence, if the universe started out with B = 0 and 
L = 0, it would always have B = 0 and L = 0. In other words, as long as B and L are conserved, the baryons 
and leptons of the universe have to be put in at the beginning: they are part of the initial conditions. While this 
possibility may be theologically appealing to some people, the typical physicist would much prefer to be able to 
calculate as many observed quantities as possible. 

Thus, for an ab initio calculation of the baryon and lepton number of the universe at present, a necessary 
ingredient is baryon and lepton number nonconservation. 

Now again a lightning summary of the relevant particle physics. Until the early 1970s, particle physicists 
thought of the four fundamental forces—the strong, the weak, the electromagnetic, and gravity—as unrelated. 
First, the electromagnetic and the weak interactions were merged into a single electroweak interaction, as 
mentioned above. Later, the strong and the electroweak interactions were further unified into a single interaction, 
known as the grand unified interaction, with a characteristic energy scale of about $10^{16}$ GeV. In other words, the
grand unified theory predicts that in processes in which particles with energies of about $10^{16}$ GeV collide with
one another, the strong, the weak, and the electromagnetic forces become the same force. While experimental
confirmation remains lacking at present, there are various compelling theoretical reasons for believing in the
grand unified theory.¹²

So once again, we invoke the Gamow principle: if we think that we understand grand unified theory, then we 
can discuss the universe in the grand unified era, when the temperature was of order $10^{16}$ GeV.

Reading the preceding description of the strong, the weak, and the electromagnetic interactions, you may 
have realized that, with these interactions, quarks are always transformed into quarks and leptons always into 
leptons. Before grand unification, you could take a piece of paper, draw a line down the middle, and write down 
the names of all the quarks on the left side of the line, and the names of all the leptons on the right side of the 
line. The fundamental forces act on these particles, the quarks and the leptons, transforming one particle into 
another. But in all these transformations, a particle on one side of the line is never changed into a particle on the 
other side. A quark is never transformed into a lepton, and a lepton is never transformed into a quark. 

Grand unification erases the line drawn on that piece of paper. The worlds of quarks and leptons can no longer 
be separated; the two worlds are unified. With grand unification come new transformations that change quarks 
into leptons and vice versa.* 

With baryon and lepton number nonconservation, it is now possible to start the universe with B = 0 and L = 0,
and then through various processes during the grand unified era to generate nonvanishing B and L. After the 
grand unified era, the universe is too cool for these baryon and lepton number nonconserving processes to 
proceed, and we end up with the number of baryons and leptons that are observed at present. 


* In Fearful, I spoke of a magician whose art is limited to transforming one animal into another animal,
one fruit into another. A rabbit and an apple are on the stage. The magician waves his cape, and, whoosh, 
the rabbit and the apple are transformed into a fox and some sour grapes. The audience bursts into applause. 
Whoosh! The fox and the grapes are gone, replaced by a mouse and a watermelon. But no matter how fantastic 
the transformations, there always will be one animal and one fruit on stage. So, too, the fundamental forces 
can only transform one quark into another quark, one lepton into another lepton. You may recognize that this 
implies baryon conservation: the three quarks that made up a baryon can be transformed only into three other 
quarks. There always will be three quarks, just as there always will be one animal on stage in the analogy. Onto 
the stage struts a new magician, the mysterious and amazing Mr. Grand Unification. Applause, and whoosh! 
The rabbit is transformed into an orange. No more animal on stage. So too in grand unified theory. Whoosh! No 
more baryon on stage. The three quarks inside a proton can be transformed into leptons. Baryon number is no 
longer conserved, and the proton can decay. 

At this point, we realize that there is another problem.

Back in chapter III.4, I mentioned one triumph of special relativity: when combined with quantum physics, 
it necessarily leads to the existence of antimatter. For every particle, there is a corresponding antiparticle. Again, 
in chapter VII.3, in discussing Hawking radiation, we talked about an electron and a positron (the antielectron) 
popping out of the vacuum. Similarly, a proton and an antiproton can pop out of the vacuum. Particle physicists 
define the antiproton to have B = −1, and the positron to have L = −1, so that these processes in which pairs of
particles and antiparticles pop out of the vacuum conserve B and L. 

Furthermore, it was discovered that the fundamental laws of physics are invariant under an operation (known 
as CP: charge conjugation followed by parity) that interchanges particles and antiparticles. In other words, physics 
does not favor matter over antimatter, and vice versa. 

So, here is the problem. Seen in this light, the problem of understanding the baryon and lepton content 
of this universe becomes a problem of understanding the matter-antimatter asymmetry of the universe. Why 
does matter dominate over antimatter in our universe? In other words, starting with B = 0 and L = 0, how 
would the universe know to generate a positive baryon number, rather than a negative baryon number? Similarly 
for the lepton number. If we truly understand what is going on, we should be able to calculate the sign as well 
as the magnitude of $n_B/n_\gamma$ (and similarly for $n_L/n_\gamma$). Therefore, to cook up some baryons and leptons for the
universe, we have to introduce yet another ingredient into the story: we need to invoke physical processes that 
would favor matter over antimatter. 

Hence I need to mention one more thing about particle physics. In 1964, the belief that the laws of physics must 
be invariant under CP was shattered by J. Cronin, V. Fitch, and collaborators.¹³ They found experimentally that,
in the decay (mediated by the weak interaction) of certain mesons, particles and antiparticles behave differently 
by a tiny amount. 

With CP violated experimentally and B and L violated theoretically, we finally have all the ingredients to 
generate the baryon and lepton content of the universe during the grand unified era.¹⁴ The details do not concern
us, but the relevant processes involve the decay and interaction of various cousins of the photons present only 
in the grand unified theory. Conceptually, our ability to calculate the matter content of the universe is not much 
different from our ability to calculate the hydrogen and helium content of the universe, but it is much shakier 
in accordance with the Gamow principle. 

I end this rather long appendix by mentioning another twist to this story. Later, it was discovered theoretically
that the electroweak interaction also violates B and L, through nonperturbative effects (which explains why it was
not known earlier), but conserves B − L. Hence, people now claim that the baryons and leptons generated in
the grand unified era would get washed out in the electroweak era (that is, the universe would relapse to a
state with B = 0 and L = 0).

Instead, according to one scenario known as leptogenesis, during the electroweak era, processes involving 
neutrinos are supposed to generate L, which, since B − L is conserved, would also generate B as a kind of
collateral damage. Some people swear by this leptogenesis scenario, but to some others, the original scenario of 
a grand unified birth seems much simpler and cleaner. 


Notes 


1. For example, S. Weinberg, Gravitation and Cosmology: Principles and Applications of the General Theory of 
Relativity, Wiley, 1972. 

2. S. Weinberg, Cosmology, Oxford University Press, 2008, and ibid.; V. Mukhanov, Physical Foundations of 
Cosmology, Cambridge University Press, 2005. 

3. Indeed, much of the exposition is adapted from Toy/Universe. The reader who is totally ignorant of cosmology 
might find this popular book an easy introduction to the subject. 

4. Steve Weinberg has given a fascinating analysis of this question. More often than not, history does not develop 
in a straight line. For one thing, Gamow botched the details of primordial nucleosynthesis. He arbitrarily 
supposed that the early universe contained neutrons but not protons. For this and other reasons, the Big 
Bang cosmology of Gamow was not taken seriously and gradually faded from the general consciousness. 
Penzias and Wilson were totally unaware of Gamow’s prediction that a faint glow from the Big Bang ought 
to be observable; they were trying to eliminate an annoying hum in an antenna they were working on. Quite 
remarkably, not 50 miles from them but unbeknownst to them, a group of physicists at Princeton University 
consisting of Robert Dicke, P. G. Roll, and David Wilkinson, were setting up an experiment to detect whether 
the universe had once been hotter. They had also forgotten Gamow’s calculation. At Dicke’s suggestion, a 
young theorist named James Peebles worked out primeval nucleosynthesis all over again. He was thus able to 
predict the expected average energy of the microwave photons. Unfortunately for the Princeton group, they
were beaten to the punch. When they heard about the persistent hum picked up at the telephone company lab, 
they were thunderstruck and immediately realized the magnitude of what Penzias and Wilson had discovered 
serendipitously. 


5. For an easy account of what happened to the universe starting from about 400,000 years, see T. Abel, Physics
Today, April 2011, p. 51.

6. Incidentally, historians know about Newton’s letters because in 1756, Bentley’s heirs published them under
the title “Four Letters from Sir Isaac Newton to Doctor Bentley Containing Some Arguments in Proof of a
Deity.”

7. For an easy pedagogical introduction to structure formation, see A. Zee, Unity of Forces in the Universe,
volume II, chapter XII. A simple version of the calculation referred to here is given in the appendix.


8. The angle brackets denote averaging over an ensemble of realizations of the fluctuation. There is a slight
subtlety here, since we have only one universe. In practice, the fluctuation is expanded into multipole moments and the
average is taken over different azimuthal moments. We won't go into such details of observational cosmology
in this text.


9. See, for example, QFT Nut, part VII.

10. Recall the brachistochrone problem from chapter II.1.

11. For a more leisurely account at the level of popular physics books, see Fearful.

12. I discuss grand unified theory in considerable detail at the level of popular physics in Fearful.

13. Incidentally, both Jim Cronin and Val Fitch influenced my formation as a physicist.

14. The early work in the context of grand unified theory was done by M. Yoshimura, S. Dimopoulos, L. Susskind,
D. Toussaint, S. Treiman, F. Wilczek, A. Zee, and others. Later, it became known that in the Soviet Union,
A. D. Sakharov had discussed the general framework for generating the baryon content of the universe in
1967, long before the invention of grand unified theory. See A. Zee, Unity of Forces, chapter XI.

Traditional cosmology beset by problems


Before the discovery of dark energy in 1999, it was presumed that the cosmological constant
was zero and cosmic expansion was driven entirely by matter and radiation. By the late 
1970s, it gradually became clear that Big Bang cosmology as understood at that time was 
beset by several serious problems, known as the horizon problem, the homogeneity and 
isotropy problem, the flatness problem, and the relic problem. We will first discuss these 
problems in turn¹ before turning to inflationary cosmology.


Horizon problem 


At any given time $t$ after the Big Bang, light, even traveling at the universe’s ultimate speed
limit, could not have gotten arbitrarily far. There had not been enough time to have gotten
farther than a certain horizon² distance $d_{\text{horizon}}(t)$ to be defined below. Thus, two points
farther apart than $d_{\text{horizon}}(t)$ could not have been in causal contact with each other. Using
the cosmic time $\eta$ introduced in chapter V.3, we can make this starkly clear pictorially
(figure 1). As in Minkowski spacetime, light moves along 45° lines. As indicated, events
A and B are in the future light cone of O, and B and C are in the future light cone of P.
Events A and B are causally correlated: events at O could have influenced both of them.
Similarly, events B and C are causally correlated, but not events A and C. I daresay that
even the proverbial guy in the street could understand this point.³

Consequently, the early universe can be regarded as cut up into small patches called 
causal domains, with different causal domains uncorrelated with one another. True, the 
universe has expanded a great deal since then, and you might think that these primeval 
causal domains are now huge, but a power law expansion driven by matter and radiation
proved to be insufficient. In this traditional Big Bang scenario, we would expect the present 
universe to look like patches of causal domains, rather than the homogeneous smooth 
universe that it actually is. 

Figure 1 Causal domains in the early universe: Events A and
B are causally correlated; similarly, events B and C are causally 
correlated, but not events A and C. 


Homogeneity and isotropy problem 


The universe is homogeneous and isotropic, and that’s of course why cosmic expansion 
can be studied so simply with an ordinary differential equation, as in chapter VIII.1. But 
the universe is way too homogeneous and isotropic! 

As mentioned in the preceding chapter, the temperature variation of the cosmic microwave
background amounts to only $(\delta T/T) \sim 10^{-5}$ across the sky. How can causally
uncorrelated domains end up having almost the same temperature? The size of
these patches should be determined by $d_{\text{horizon}}(t)$ at photon decoupling, because ever
since that time, photons have been streaming freely. How did the universe get to be so 
smooth?* 

Before the late 1970s, some physicists would argue that this homogeneity problem is a 
matter of the initial conditions the universe started with, and hence outside the purview 
of theoretical physics. 


Flatness problem 


From (VIII.2.29), we learned that $\Omega_k$, which measures the curvature of the universe,
evolves according to $\dot{\Omega}_k = H\Omega_k(2\Omega_r + \Omega_m - 2\Omega_\Lambda)$. At one time, it was thought that $\Omega_\Lambda$ is
strictly zero, and so


$$\dot{\Omega}_k = H\Omega_k(2\Omega_r + \Omega_m) \tag{1}$$


Since $(2\Omega_r + \Omega_m) > 0$, any deviation away from $\Omega_k = 0$ is unstable. As long as $\Omega_k$ is
not 0, its magnitude would grow regardless of its sign. How did the universe get to be
so flat?


* As explained also in the preceding chapter, the homogeneity problem implies there was not enough time 
for the irregularities at the time of decoupling to grow into the structures we see today. That problem was solved 
by dark matter getting a head start on growing structures. 

Relic problem


One consequence of the grand unified theory mentioned in the preceding chapter is that
the universe should contain a certain number of magnetic monopoles and antimonopoles per unit
volume left over as a relic of the grand unified era, at a level totally in contradiction with
observation. Indeed, as you no doubt know, the magnetic monopole has never been observed.
If we claim that we understand the grand unified theory, then the Gamow principle
tells us that the absence of relics from the grand unified era poses a serious problem. After
all, the relic from the atomic era, namely the cosmic microwave background, is well measured.
But a conservative would say that we understand atomic physics so much better
than grand unified physics.


Distance to the horizon 


Let us now calculate the distance to the horizon. In chapter V.3, we introduced the proper
distance⁴

$$d(t, R) = a(t) \int_0^R \frac{dr}{\sqrt{1 - kr^2}}$$

between two distant points with coordinates $(t, 0, \theta, \varphi)$ and $(t, R, \theta, \varphi)$. Evidently, it is
a function of two variables $t$ and $R$. The horizon distance $d_{\text{horizon}}(t)$, a function of one
variable, is defined by choosing $R$ to be the coordinate distance light could have traversed
by time $t$, starting at the Big Bang. Since light rays follow paths determined by $ds = 0$, that
is, by $dt/a(t) = dr/\sqrt{1 - kr^2}$, the horizon distance is given by

$$d_{\text{horizon}}(t) = a(t) \int_0^R \frac{dr}{\sqrt{1 - kr^2}} = a(t) \int_0^t \frac{dt'}{a(t')} \tag{2}$$

In other words, it is the coordinate distance $\int_0^t dt'/a(t')$ expanded by the scale factor $a(t)$
at time $t$.

Recall also from chapter VIII.1 that $a(t) \propto t^\gamma$, with $\gamma = \frac{1}{2}$ for a radiation dominated
early universe and $\gamma = \frac{2}{3}$ for a matter dominated early universe. (The overall constant
$A$ in $a(t) = At^\gamma$ obviously cancels out in $d_{\text{horizon}}(t)$.) So, evaluate the integral in (2):
$I(t, t_0) \equiv \int_{t_0}^t dt'/a(t') = A^{-1}(1 - \gamma)^{-1}(t^{1-\gamma} - t_0^{1-\gamma})$.

For $(1 - \gamma) > 0$, as is the case for the traditional radiation or matter dominated early
universe, $I(t, t_0)$ receives most of its contribution from late times. In fact, for our purposes
here, we can let $t_0 \to 0$, so that


$$d_{\text{horizon}}(t) = (1 - \gamma)^{-1}\, t \tag{3}$$


grows linearly with time.


VIII.4. Inflationary Cosmology | 533 


For $(1 - \gamma) < 0$, that is, if $\gamma > 1$, the situation is obviously reversed: the regime around
$t_0$ contributes the most to $I(t, t_0)$. Evidently, the steep decrease of the integrand as
time increases is responsible. In the extreme case, $a(t) \propto e^{Ht} \sim t^\infty$, $I(t, t_0) = (e^{-Ht_0} -
e^{-Ht})/H$ so that


$$d_{\text{horizon}}(t) = \frac{1}{H}\left(e^{H(t - t_0)} - 1\right) \tag{4}$$


grows exponentially.
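
As a quick numerical check of (3) and (4), here is a minimal Python sketch that evaluates the horizon distance, the scale factor times the integral of dt'/a(t'), by Simpson's rule and compares it with the closed forms. The helper names `simpson` and `horizon_distance`, the cutoff t_0 = 0.01, and the value H = 3 are illustrative assumptions, not taken from the text.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def horizon_distance(scale_factor, t, t0):
    """a(t) times the integral of dt'/a(t') from t0 to t, done numerically."""
    return scale_factor(t) * simpson(lambda s: 1.0 / scale_factor(s), t0, t)

t0, t = 0.01, 1.0

# Power law expansion a(t) = t**gamma (radiation dominated: gamma = 1/2).
gamma = 0.5
numeric_power = horizon_distance(lambda s: s**gamma, t, t0)
exact_power = t**gamma * (t**(1 - gamma) - t0**(1 - gamma)) / (1 - gamma)

# Exponential expansion a(t) = exp(H t), to be compared with (4).
H = 3.0
numeric_exp = horizon_distance(lambda s: math.exp(H * s), t, t0)
exact_exp = (math.exp(H * (t - t0)) - 1.0) / H

print(numeric_power, exact_power)
print(numeric_exp, exact_exp)
```

In the power law case the cutoff only contributes a small correction to t/(1 − γ), whereas in the exponential case the result is controlled entirely by how far back t_0 lies.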


Inflationary epoch 


With the blinding clarity of hindsight, we now see that all these problems can be solved 
if the early universe went through an inflationary epoch,⁵ during which it expanded
exponentially. As was just shown, $d_{\text{horizon}}(t)$ would then grow exponentially, thus solving⁶
the horizon and homogeneity problems, and as we will see below, also the flatness problem. 
In addition, this exponential expansion would greatly dilute the density of any hitherto 
unobserved and hence undesirable relics. 

Before I go further, let me warn the reader that as this book goes to press, there is 
considerable debate regarding whether inflationary cosmology is still viable.⁷ You will have
to form your own opinion over the coming years. 

As you have known since chapter V.3, the desired exponential expansion can be produced
readily by a constant energy density corresponding to some effective cosmological
constant. With a cosmological constant, (1) is replaced by


$$\dot{\Omega}_k = H\Omega_k(2\Omega_r + \Omega_m - 2\Omega_\Lambda) \tag{5}$$


With $(2\Omega_r + \Omega_m - 2\Omega_\Lambda) < 0$, the flow around small $\Omega_k$ goes from being unstable to stable,
and the flatness problem is solved. Any initial value of $\Omega_k$ gets driven to 0.
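
To see the change from unstable to stable flow in numbers, one can linearize around small curvature. The Python sketch below assumes that the combination in parentheses in (1) and (5) is held constant (reasonable only while one component dominates and the curvature term is small) and that time is measured in e-foldings N, with dN = H dt. Under those assumptions the curvature parameter simply scales as exp(cN). The function name `omega_k_after` and the sample values are hypothetical.

```python
import math

def omega_k_after(omega_k_initial, n_efolds, omega_r, omega_m, omega_lambda):
    """Linearized flow of the curvature parameter after n_efolds e-foldings.

    Assumes the coefficient c = 2*omega_r + omega_m - 2*omega_lambda is
    constant, so that d(omega_k)/dN = c * omega_k integrates to an exponential.
    """
    c = 2 * omega_r + omega_m - 2 * omega_lambda
    return omega_k_initial * math.exp(c * n_efolds)

# Radiation domination (c = +2): a tiny curvature grows.
grown = omega_k_after(1e-10, 10, omega_r=1.0, omega_m=0.0, omega_lambda=0.0)

# Cosmological constant domination (c = -2): curvature is driven to zero.
flattened = omega_k_after(0.1, 70, omega_r=0.0, omega_m=0.0, omega_lambda=1.0)

print(grown, flattened)
```

Ten e-foldings of radiation domination amplify the curvature parameter by exp(20), while 70 e-foldings of inflation suppress it by exp(−140).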

One triumph of inflation is that it can account for the origin of the density fluctuations 
needed (as explained in the preceding chapter) for the growth of structures. Where did 
these density fluctuations come from? How did they get put in at the Big Bang? These are 
all questions you might have asked. You might even have thought of quantum fluctuations, 
which, since we live in a quantum universe, are inevitably present. Before the inflationary 
universe was proposed, people would have immediately dismissed the notion of quantum 
fluctuations being responsible for the density fluctuations in the primeval universe; the 
quantum length scale on which these fluctuations occur would seem to be irrelevantly minuscule
compared to cosmological distance scales. However, in an inflationary scenario,
the fluctuations could have stretched out enormously. Furthermore, after this inflationary
stretching, the resulting spectrum of density fluctuations would end up having no characteristic
length scale, giving rise to the scale-free spectrum proposed by E. R. Harrison
and Y. B. Zeldovich long before inflation was invented. I will leave detailed calculations
of the primeval density fluctuation to more specialized texts. Remarkably, fluctuations begotten
by the quantum and stretched by inflation could have led to the structures we see
around us. 

An inflationary universe


That the universe went through some sort of inflationary epoch during which it expanded 
exponentially seems quite compelling, but the specifics of any of the proposed mechanisms 
should probably be taken with a grain of salt and a smile. 

Numerically, to accord with observation, the scale factor $a(t)$ needs to have expanded by
a factor of $10^{30}$, starting from ~10~*° second after the Big Bang to ~10~° second after
the Big Bang. The required number of e-foldings is thus $30 \log 10 \simeq 70$. Note that these
numbers do not come out of the theory but are required to accord with experiment. 
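
The e-folding count is just the natural logarithm of the expansion factor. A one-line check in Python (the variable names are illustrative):

```python
import math

expansion_factor = 1e30              # growth of the scale factor quoted in the text
n_efolds = math.log(expansion_factor)  # natural log, equal to 30 * ln(10)
print(round(n_efolds, 1))            # 69.1
```

This is the "about 70" quoted above.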

In Einstein gravity, it is easy to cause the universe to expand exponentially: all you have 
to do, as we saw in chapter VI.2, is to introduce a constant energy density in the vacuum, 
effectively a cosmological constant. In particle theory, it is easy to produce a constant energy
density: a scalar field that does not vary in spacetime would do that, as you learned in 
chapter VI.4. Thus, with the benefit of hindsight, it is doubly easy to make the universe go 
through an inflationary epoch. 

Amusingly, Einstein’s introduction of the cosmological constant, far from a blunder as 
the uninformed called it, turns out to be essential for cosmology, both observationally and 
theoretically. 

Indeed, you worked out in exercise VI.4.4 that, for a scalar field governed by the action
$S_{\text{scalar}} = -\int d^4x \sqrt{-g}\left(\frac{1}{2}(\partial\phi)^2 + V(\phi)\right)$ (with $(\partial\phi)^2 = g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi$), the energy momentum
tensor is given by $T^{\mu\nu} = \partial^\mu\phi\,\partial^\nu\phi - g^{\mu\nu}\left(\frac{1}{2}(\partial\phi)^2 + V(\phi)\right)$. For $\phi(x)$ constant,
$T^{\mu\nu} = -g^{\mu\nu}V(\phi)$. Actually, we don’t even need to calculate the energy momentum tensor;
we can see directly from the action that, as explained in chapter VI.2, a constant in
$V(\phi)$ corresponds to a constant energy density.

One difficulty confronting inflationary theories is known as the graceful exit problem: 
how do you end inflation once you have had your fill of the 70 e-foldings? Another puzzle is 
why the effective cosmological constant was once large (so as to dominate the contribution 
of the relativistic matter, namely radiation in the generalized sense, to the energy density) 
but is now so incredibly small.* 

Since we know almost nothing about the origin of this scalar field, called rather unimaginatively
the inflaton, and the physics that goes with it, people feel free to draw whatever
$V(\phi)$ would produce a desired outcome, for example, the “slow roll” potential shown in
figure 2. In this pictorial analogy, the inflaton $\phi$ is supposed to start out on the nearly flat
plateau and slowly roll downhill (see the appendix), ending up in a minimum that is many
orders of magnitude smaller.

Not surprisingly, this has generated in the theory literature a multitude of scenarios that 
go under such names as old inflation, new inflation, chaotic inflation, eternal inflation, and 
stochastic inflation. Invent your own! In chaotic inflation, one does not need to fine-tune 
the potential $V(\phi)$; instead the universe is vast and inhomogeneous, with $V(\phi)$ taking on


* We will return to the cosmological constant in chapter X.7. 

Figure 2 The inflaton potential.


random values in different regions, so that some regions inflate rapidly, while others may 
inflate less rapidly or even shrink. One then argues that the probability is overwhelming 
that we would find ourselves in a region that has undergone an inflationary phase, just 
because such regions occupy exponentially more volume than others. 

As I said earlier, in recent years, inflationary cosmology has been faced with mounting 
difficulties and increasing criticism. I close by mentioning that there are a number of
interesting alternatives.⁸


Appendix: Slow roll scenario 


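Here I sketch the popular slow roll scenario. Start with the energy momentum tensor for a scalar field $\phi$,
the inflaton, namely $T^{\mu\nu} = \partial^\mu\phi\,\partial^\nu\phi - g^{\mu\nu}\left(\frac{1}{2}(\partial\phi)^2 + V(\phi)\right)$. It takes on the form of a perfect fluid
$T^{\mu\nu} = (\rho + P)U^\mu U^\nu + Pg^{\mu\nu}$, with $U_\mu = \partial_\mu\phi/\sqrt{-(\partial\phi)^2}$,

$$\rho = -\tfrac{1}{2}(\partial\phi)^2 + V(\phi) \tag{6}$$

and

$$P = -\tfrac{1}{2}(\partial\phi)^2 - V(\phi) \tag{7}$$

If we imagine that $\phi(t, \vec{x}) = \phi(t)$ could be independent of space, then $\rho = \frac{1}{2}\dot\phi^2 + V(\phi)$ and $P = \frac{1}{2}\dot\phi^2 - V(\phi)$,
and the inflaton obeys the equation of motion

$$\ddot\phi + 3H\dot\phi + V'(\phi) = 0 \tag{8}$$

and the expansion of the universe is governed by

$$H^2 = \tfrac{1}{3}\left(\tfrac{1}{2}\dot\phi^2 + V(\phi)\right) \tag{9}$$

in units with $8\pi G = 1$. Note that since we suppressed the spatial dependence by decree, the equation of motion
(8) is that of a point particle rolling in the potential $V(\phi)$. Crucially, the expansion of the universe provides a
friction term $\sim H\dot\phi$.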

We can obtain an inflationary epoch if the inflaton varies sufficiently slowly in time so that $\dot\phi^2 \ll V(\phi)$. Then
(9) becomes

$$H^2 \simeq \tfrac{1}{3}V(\phi) \tag{10}$$

with the Hubble parameter approximately constant in time, producing an effective cosmological constant. If we
also⁹ impose $\ddot\phi \ll V'(\phi)$, then we obtain from (8)

$$3H\dot\phi \simeq -V'(\phi) \tag{11}$$

Using (11) and (10), we have $\dot\phi^2 \sim (V'/H)^2 \sim V'^2/V$, and so we can write the condition $\dot\phi^2 \ll V(\phi)$ as
$(V'/V)^2 \ll 1$. Furthermore, since $\dot\phi \sim -V'/V^{1/2}$, we have $\ddot\phi \sim (V'V''/V) - (V'^3/(2V^2))$, and thus the other
condition $\ddot\phi \ll V'(\phi)$ gives $V''/V \ll 1$. Thus, it is customary to define two parameters $\epsilon \equiv \frac{1}{2}(V'/V)^2$ and $\eta \equiv
V''/V$. For the slow roll approximation to hold, we need both $\epsilon \ll 1$ and $\eta \ll 1$. Note that the second condition
also amounts to saying that the variation of $\epsilon(\phi)$ with respect to $\phi$ is small.
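
For a concrete feel for the two slow roll parameters, the Python sketch below evaluates them by central finite differences for an arbitrary potential. The quadratic potential, the mass value, and the field value 15 (in the units with 8πG = 1 used in this appendix) are purely illustrative assumptions; the text does not single out any particular V.

```python
def slow_roll_parameters(V, phi, h=1e-3):
    """Return (epsilon, eta) = (0.5*(V'/V)**2, V''/V) at field value phi.

    Derivatives are estimated with central finite differences of step h.
    """
    v0 = V(phi)
    v1 = (V(phi + h) - V(phi - h)) / (2 * h)          # first derivative V'
    v2 = (V(phi + h) - 2 * v0 + V(phi - h)) / h**2    # second derivative V''
    epsilon = 0.5 * (v1 / v0) ** 2
    eta = v2 / v0
    return epsilon, eta

m = 1e-6  # hypothetical inflaton mass, chosen only for illustration

def quadratic(phi):
    """Illustrative potential V = m^2 phi^2 / 2 (an assumption, not from the text)."""
    return 0.5 * m**2 * phi**2

eps, eta = slow_roll_parameters(quadratic, 15.0)
print(eps, eta)
```

For this potential both parameters equal 2/φ², so the slow roll conditions hold only while the field value is large in these units.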


Notes 


1. As mentioned in the preceding chapter, the interested reader should also consult specialized textbooks on 
cosmology. 

2. Some authors introduce two distinct terms, calling this horizon a “particle horizon” and the horizon around 
black holes an “event horizon.” I think that it is easy enough to distinguish them by context. 

3. And if not, replace light by messengers in ancient times. 

4. We also mentioned there that the concept is slightly sloppy and involves a “cosmic conspiracy.” 

5. The early history of inflation is too involved to go into here. A detailed exposition may be found in A. Guth,
V. Mukhanov, and S. Weinberg. We might mention here early work by A. Starobinsky, B. Chirikov,
V. Mukhanov, R. Brandenberger, A. Guth, H. Tye, A. Zee, K. Sato, and M. Einhorn, among others. See
in particular p. 180 of Guth.

6. It has been pointed out that there is a hidden assumption about the measure of the initial pre-evolutionary 
data. Mathematically, an arbitrary present configuration of the universe could be evolved backward in time 
to some initial configuration. The correct statement is that inflation can take a generic initial configuration 
and iron it out. 

7. For instance, Max Tegmark, writing in New Scientist in 2012, states that the inflationary scenario should be 
abandoned. He puts it humorously as follows: “You know how sometimes you meet somebody and they’re 
really nice, so you invite them over to your house and you keep talking with them and they keep telling you 
more and more cool stuff? But then at some point you're like, maybe we should call it a day, but they just 
won't leave and they keep talking and as more stuff comes up it becomes more and more disturbing and 
you're like, just stop already? That’s kind of what happened with inflation.” 

8. For example, the bounce theory can solve all the problems that inflation can solve (R. H. Brandenberger, private
communication). See R. H. Brandenberger, “Introduction to Early Universe Cosmology,” PoS ICFI2010,
001 (2010), arXiv:1103.2271 [astro-ph.CO].

9. Some texts assert erroneously that the condition $\ddot\phi \ll V'(\phi)$ follows upon differentiating the condition
$\dot\phi^2 \ll V(\phi)$. It is known to any student of calculus that $f(t) \ll g(t)$ does not imply that $f'(t) \ll g'(t)$. For
example, $f(t) = \sin(100t)/10$, $g(t) = 1$.


Recap to Part VIII 


The stuff contained in the universe causes the universe to expand, and the expansion of 
the universe affects the density of the stuff contained in the universe. Knowing Einstein’s 
field equation and the properties of the stuff allows us to work out the expansion history 
of the universe. 

To first approximation, the universe can be described as a struggle between dark energy 
and dark matter. This cosmic contest can be mapped out in a 2-dimensional diagram. 

In speculating about the history of the universe, we should keep in mind Gamow’s 
principle: if we understand the physics characteristic of a certain energy scale, then we can
work out, in the grand tradition of physics, how the universe behaves when its temperature 
is of that scale. 

Gravity at Work and at Play


Part IX: Aspects of Gravity


Parallel Transport 


Keep on pointing in the same direction 


Way back in chapter I.7 (titled “Differential Geometry Made Easy, but Not Any Easier”
as you may recall), I introduced the notion of parallel transport in the context of curved 
surfaces. While some uninitiates might find the notion a bit difficult to grasp, parallel 
transport is in fact firmly rooted in common everyday intuition. Think of a patent clerk in 
Bern walking along a closed path, carrying a spear and pointing it in the same direction 
the whole time. For a vector on a surface, we simply parallel transport it in the ambient 
Euclidean space, keeping its “feathered end” in the surface, and then chop off the component
sticking out of the surface. The result, as we saw back in chapter I.7, is closely related
to the concept of covariant derivative. 

In this chapter, we generalize the notion of parallel transport from the surfaces of 
chapter I.7 to Riemannian spacetimes. The key is the covariant derivative discussed in 
chapter V.6. Consider a curve C defined by $x^\mu(\tau)$ and parametrized by $\tau$. The curve C is 
not necessarily a geodesic, just some curve. Let a vector $S^\mu$ be given at some point on the 
curve located by $\tau_I$. You may recall that in appendix 1 of chapter V.6, the notion of covariant 
derivative along a curve was introduced. Here we simply set this derivative along the curve 
C to zero. We then determine the vector $S^\mu(\tau)$ at other points along the curve by integrating 
the first order differential equation 

$$\frac{dS^\mu(\tau)}{d\tau} + \Gamma^\mu_{\rho\omega}(x(\tau))\, V^\rho(\tau)\, S^\omega(\tau) = 0 \qquad (1)$$

with the given vector $S^\mu(\tau_I)$ providing the initial condition. Here $V^\rho = \frac{dx^\rho}{d\tau}$ denotes the 
tangent vector along the curve. The vector $S^\mu$ is said to be parallel transported along the 
curve C. 

We should be careful not to write $S^\mu(\tau)$ as $S^\mu(x(\tau))$! The erroneous notation $S^\mu(x(\tau))$ 
would suggest that $S^\mu(x)$ exists, that is, as a vector field having a value at any point x in 
spacetime, and that $S^\mu(x(\tau))$ is equal to the vector field $S^\mu(x)$ evaluated on the curve. This 
may sound like nitpicking, but it’s not: we must keep straight what we are doing. Similarly, 
$V^\mu(\tau)$ is only defined on the curve. 

In contrast, the metric and the Christoffel symbol are defined (with possibly more than 
one coordinate patch needed) all over the manifold. Thus, in (1), $\Gamma^\mu_{\rho\omega}(x(\tau))$ denotes $\Gamma^\mu_{\rho\omega}(x)$ 
evaluated at the point $x(\tau)$ on the curve. 

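It is useful and natural to ask what differential equation $S_\mu = g_{\mu\nu}S^\nu$ would satisfy. Simply 
differentiate and use (1): 

$$\frac{dS_\mu(\tau)}{d\tau} = (\partial_\rho g_{\mu\omega})\frac{dx^\rho}{d\tau}S^\omega + g_{\mu\nu}\frac{dS^\nu}{d\tau} = \left(\partial_\rho g_{\mu\omega} - g_{\mu\nu}\Gamma^\nu_{\rho\omega}\right)V^\rho S^\omega$$

Inserting the definition of $\Gamma^\nu_{\rho\omega}$, we find that the expression in parentheses simplifies to 
$\Gamma_{\omega\mu\rho}$, that is, the Christoffel symbol with its upper index lowered. Thus, we obtain 

$$\frac{dS_\mu(\tau)}{d\tau} - \Gamma^\omega_{\mu\rho}(x(\tau))\, V^\rho(\tau)\, S_\omega(\tau) = 0 \qquad (2)$$

Notice that (1) and (2) differ by a sign: we parallel transport $S^\mu$ and $S_\mu$ with a crucial 
difference in sign. This immediately implies that if we parallel transport two different 
vectors $S^\mu$ and $T_\mu$, then 

$$\frac{d(S^\mu T_\mu)}{d\tau} = \left(-\Gamma^\mu_{\rho\omega}S^\omega T_\mu + S^\mu\Gamma^\omega_{\mu\rho}T_\omega\right)V^\rho = \left(-\Gamma^\mu_{\rho\omega}S^\omega T_\mu + \Gamma^\mu_{\omega\rho}S^\omega T_\mu\right)V^\rho = 0 \qquad (3)$$

since $\Gamma$ is symmetric in its two lower indices. This result makes sense: parallel transporting 
two vectors $S^\mu$ and $T_\mu$, carefully keeping them “pointing in the same direction,” so to speak, 
we would have every right to expect that their scalar product $S^\mu T_\mu$ would not change. 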
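
As a concrete check of the parallel transport equation (1), here is a minimal numerical sketch. It is not from the text: it assumes the unit 2-sphere with coordinates (theta, phi), the curve theta = theta0, phi = tau (a circle of latitude, which is not a geodesic unless theta0 = pi/2), and a hand-written Runge-Kutta integrator. The function names are illustrative only.

```python
import math

def rk4_step(f, y, h):
    """One classical Runge-Kutta step for the autonomous system dy/dtau = f(y)."""
    n = len(y)
    k1 = f(y)
    k2 = f([y[i] + 0.5 * h * k1[i] for i in range(n)])
    k3 = f([y[i] + 0.5 * h * k2[i] for i in range(n)])
    k4 = f([y[i] + h * k3[i] for i in range(n)])
    return [y[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(n)]

def transport_around_latitude(theta0, S0, steps=2000):
    """Integrate equation (1) once around the circle theta = theta0, phi = tau."""
    # Nonzero Christoffel symbols of ds^2 = dtheta^2 + sin^2(theta) dphi^2
    G_th_phph = -math.sin(theta0) * math.cos(theta0)  # Gamma^theta_{phi phi}
    G_ph_thph = math.cos(theta0) / math.sin(theta0)   # Gamma^phi_{theta phi}

    def rhs(S):
        S_th, S_ph = S
        # Tangent vector V = (0, 1), so only terms containing V^phi survive.
        return [-G_th_phph * S_ph, -G_ph_thph * S_th]

    h = 2 * math.pi / steps
    S = list(S0)
    for _ in range(steps):
        S = rk4_step(rhs, S, h)
    return S

def norm_squared(theta0, S):
    """g_{mu nu} S^mu S^nu on the unit sphere."""
    return S[0] ** 2 + math.sin(theta0) ** 2 * S[1] ** 2

theta0 = math.pi / 3
S_start = [1.0, 0.0]
S_end = transport_around_latitude(theta0, S_start)
print("norm^2 before:", norm_squared(theta0, S_start))
print("norm^2 after: ", norm_squared(theta0, S_end))
print("components after one loop:", S_end)
```

The norm is preserved, as (3) says it must be, yet the vector does not return to itself: in an orthonormal frame it is rotated by the angle 2π cos(theta0), which for theta0 = π/3 reverses it.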


Covariant derivative and parallel transport 


At this point, bells should be ringing. You might recall that we went through an entirely 
similar discussion for covariant derivatives in chapter V.6. Given two vector fields $W^\mu(x)$ 
and $U_\mu(x)$, the covariant derivatives $D_\lambda W^\mu$ and $D_\lambda U_\mu$ are defined in (V.6.11) and (V.6.12), 
respectively, with opposite signs in precisely such a way that $D_\lambda(W^\mu U_\mu)$ simplifies to 
$\partial_\lambda(W^\mu U_\mu)$. 

Imagine moving through a vector field $W^\mu(x)$ along a curve C (again, not necessarily a 
geodesic) defined by $x(\tau)$. Now it makes sense to define $W^\mu(\tau)$ to be equal to $W^\mu(x(\tau))$: it is 
the value of the vector field you experience at the point parametrized by $\tau$ as you move along 
C. Then $\frac{dW^\mu(\tau)}{d\tau} = \frac{dx^\rho}{d\tau}\partial_\rho W^\mu(x(\tau)) = V^\rho(\tau)\partial_\rho W^\mu$. In other words, $\frac{dW^\mu}{d\tau} + \Gamma^\mu_{\rho\omega}V^\rho W^\omega = V^\rho D_\rho W^\mu$. 
We conclude that if the vector field $W^\mu(x)$ is (covariantly) constant in the sense 
that its covariant derivative $D_\rho W^\mu$ vanishes, then the vector $W^\mu(x(\tau))$ you experience as 
you glide through the vector field is precisely parallel transported. 

Everything is coming together. Already, back in chapter I.7, when we were discussing 
surfaces, Riemannian spaces that you can hold in your hands, or at least in your mind’s 
eye, we went from the ordinary derivative (I.7.11) to the covariant derivative (I.7.12) by 
dropping the component sticking out of the surface, as already alluded to in the opening 
of this chapter. 

Figure 1 Parallel transport of a vector around a curve can be reduced to parallel transport 
of a vector around two smaller curves. 


Shortest distance and straight lines 


Indeed, when you saw the parallel transport equation (1), another bell might have rung. It 
should have reminded you of the geodesic equation 

$$\frac{dV^\mu(\tau)}{d\tau} + \Gamma^\mu_{\rho\omega}(x(\tau))\, V^\rho(\tau)\, V^\omega(\tau) = 0 \qquad (4)$$

A geodesic is a curve whose tangent vector $V^\mu(\tau)$ is parallel transported as we move along 
the curve. This is just the curved space generalization of the man-in-the-street statement 
that a straight line gives the shortest distance between two points. The word “straight” can 
only mean that the tangent vector keeps pointing in the same direction. 


Riemann curvature and parallel transport 


This discussion suggests yet another way for the mite geometers (namely us) to measure 
curvature. Consider a closed curve C (in general not a geodesic) starting and ending at the 
point P. Let’s parallel transport a vector $S_\mu$ along C starting at P. 

We ask whether the vector $S_\mu$ comes back to itself when we go around the closed curve. 
In flat space, it will. Thus, the extent to which it does not provides us with a measure of 
the curvature. In other words, we want to calculate $\Delta S_\mu$ as we go around C. 

The first remark is that we can take C to be infinitesimal. The argument is that given a 
curve C, we can always decompose it into smaller pieces.* As shown in figure 1, we can 
write $C = C_1 + C_2$ as the sum of two smaller curves $C_1$ and $C_2$. Pick a point P lying on both 
$C_1$ and $C_2$. Parallel transporting $S_\mu$ around C is equivalent to parallel transporting $S_\mu$ first 
around $C_1$, starting and ending at P, and then parallel transporting $S_\mu$ around $C_2$, again 
starting and ending at P. Clearly, the curved segment shared by $C_1$ and $C_2$ is traversed 
twice in opposite directions, producing canceling contributions to $\Delta S_\mu$. Repeating the 
argument, we can cut any given closed curve into smaller and smaller closed curves. 


* You are probably familiar with this sort of argument from a variety of contexts in physics (most likely in a 
course on electromagnetism) and mathematics. 
Alternatively, argue that any irregular shaped area enclosed by a closed curve can be 
approximated by many small rectangles. Shades of integral calculus. Thank you, Newton 
and Leibniz! 

So, we take C to be infinitesimal, describing it by $x(\tau)$, so that the endpoint $x_F = x(\tau_F)$ 
is equal to the starting point $x_I = x(\tau_I)$, with some initial $\tau_I$ and final $\tau_F$. Our discussion 
evidently goes through for both space and spacetime. A trivial notation alert: My use of $\tau$ 
here does not suggest a connection with time; it is just a parameter along the curve, which 
we are in fact taking to be closed. 

Taking the limit in which the closed curve shrinks to zero, we determine the curvature 
of the manifold at the location of the closed curve. 


The Riemann curvature tensor emerges 


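So, boys and girls, let us calculate away. Around a closed curve, 

$$\Delta S_\mu = S_\mu(\tau_F) - S_\mu(\tau_I) = \int_{\tau_I}^{\tau_F} d\tau\, \frac{dS_\mu}{d\tau} = \int_{\tau_I}^{\tau_F} d\tau\, \Gamma^\rho_{\mu\nu}(x(\tau))\, V^\nu(\tau)\, S_\rho(\tau) = \int_{x_I}^{x_F} dx^\nu\, \Gamma^\rho_{\mu\nu}(x(\tau))\, S_\rho(\tau) \qquad (5)$$

where we used $V^\nu = \frac{dx^\nu}{d\tau}$ in the last step. 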

In theoretical physics, when faced with a fairly involved calculation, it is always a good 
habit to anticipate the answer. Since parallel transport is linear in S, we expect $\Delta S$ to be 
proportional to S. Also, $\Delta S$ vanishes as the closed curve shrinks to zero. Do we expect it to 
be proportional to the perimeter of the closed curve or to the area enclosed by the closed 
curve? By following a curve for a bit and then backtracking along the same curve to get 
back to the starting point, we have traced a closed curve with a nonvanishing perimeter 
but enclosing no area. But $\Delta S = 0$ for such a closed curve, since the changes in S are 
reversed on the return trip. Thus, $\Delta S$ must be proportional to the area enclosed, not to the 
perimeter. 

Since area is quadratic in length, we expect an infinitesimal area element to be given 
by something vaguely like $\delta x^\nu \delta x^\lambda$. We don’t quite know what that would be, but it must 
carry two indices like a 2-indexed tensor $a^{\nu\lambda}$. Therefore, $\Delta S_\mu$ has to be proportional to 
$S_\rho a^{\nu\lambda}$. We need a 4-indexed tensor of the form $\tilde R^\rho{}_{\mu\nu\lambda}$ to convert the 3-indexed right hand 
side to the 1-indexed left hand side. Hence we anticipate the answer to have the form 
$\Delta S_\mu = \tilde R^\rho{}_{\mu\nu\lambda} S_\rho a^{\nu\lambda}$. 

Well, let’s not be coy about it. You know, and I know, that if there is any justice in this 
world, $\tilde R^\rho{}_{\mu\nu\lambda}$ has to be none other than the Riemann curvature tensor (up to normalization 
and the ordering of its indices). We now verify our suspicion by doing the integral in (5). 

Before arithmetic overwhelms us, let us pause and reflect that, for an infinitesimal closed 
curve, the integrand $Z$ in (5) can be Taylor expanded from the value it has at the starting 
point: $Z(x(\tau)) = Z(x_I) + \partial_\lambda Z(x_I)(x(\tau)^\lambda - x_I^\lambda) + \cdots$. The constant term contributes nothing 
to the integral, since $\int_{x_I}^{x_F} dx^\nu = x_F^\nu - x_I^\nu = 0$ for a closed curve. This also makes sense, 
since $\Delta S_\mu$ obviously must vanish as the closed curve shrinks to nothing. 
The contribution of the linear term is proportional to $\oint dx^\nu (x - x_I)^\lambda = \oint dx^\nu x^\lambda$, since 
$\oint dx^\nu x_I^\lambda = x_I^\lambda \oint dx^\nu = 0$. The object $a^{\nu\lambda} \equiv \oint dx^\nu x^\lambda$ carries information about the size 
$\sim (x - x_I)$ of the loop and has dimension of length squared. So you should not be surprised 
that it gives the infinitesimal area enclosed by the closed curve. To verify this, simply 
evaluate it for a small rectangle (see exercise 1). Notice that $a^{\nu\lambda}$ may be positive or negative 
according to whether we go around the curve clockwise or counterclockwise, and it is 
antisymmetric in its indices. To see this, write $a^{\nu\lambda}$ as an integral over $\tau$ and integrate 
by parts: 

$$a^{\nu\lambda} = \oint d\tau\, \frac{dx^\nu}{d\tau}\, x^\lambda = -\oint d\tau\, x^\nu\, \frac{dx^\lambda}{d\tau} = -a^{\lambda\nu} \qquad (6)$$
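
The antisymmetry in (6), and the claim that the loop integral of dx^ν x^λ measures the enclosed area, can be checked numerically. The sketch below is an illustration, not the text's method: it assumes a polygonal loop in a flat coordinate plane, for which the integral along each straight edge can be done exactly, and it uses a triangle so as not to give away the exercises. The function name and the example loop are hypothetical.

```python
def area_element(vertices):
    """Return a[nu][lam] = closed-loop integral of dx^nu x^lam for a polygon.

    Along a straight edge from p to q the integral of dx^nu x^lam equals
    (q^nu - p^nu) * (p^lam + q^lam) / 2, so summing over edges is exact.
    """
    dim = len(vertices[0])
    a = [[0.0] * dim for _ in range(dim)]
    n = len(vertices)
    for k in range(n):
        p, q = vertices[k], vertices[(k + 1) % n]
        for nu in range(dim):
            for lam in range(dim):
                a[nu][lam] += (q[nu] - p[nu]) * (p[lam] + q[lam]) / 2.0
    return a

# Hypothetical example loop: a triangle of area 1, traversed counterclockwise.
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
a_ccw = area_element(triangle)
a_cw = area_element(list(reversed(triangle)))
print("counterclockwise:", a_ccw)
print("clockwise:       ", a_cw)
```

The diagonal entries vanish, the off-diagonal entries are equal and opposite with magnitude equal to the enclosed area, and reversing the direction of travel flips the overall sign.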


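Now that we know that only the linear term matters, we Taylor expand the two factors 
$\Gamma^\rho_{\mu\nu}(x(\tau))$ and $S_\rho(\tau)$ that make up $Z = \Gamma^\rho_{\mu\nu}(x(\tau))S_\rho(\tau)$: 

$$\Gamma^\rho_{\mu\nu}(x(\tau)) = \Gamma^\rho_{\mu\nu}(x_I) + \partial_\lambda\Gamma^\rho_{\mu\nu}(x_I)\,(x(\tau) - x_I)^\lambda + \cdots \qquad (7)$$

and 

$$S_\rho(\tau) = S_\rho(\tau_I) + \Gamma^\omega_{\rho\lambda}(x_I)\, V^\lambda(\tau_I)\, S_\omega(\tau_I)\,(\tau - \tau_I) + \cdots = S_\rho(\tau_I) + \Gamma^\sigma_{\rho\lambda}(x_I)\, S_\sigma(\tau_I)\,(x(\tau) - x_I)^\lambda + \cdots \qquad (8)$$

where we have used $V^\lambda = \frac{dx^\lambda}{d\tau}$ (and changed the dummy index $\omega$ to $\sigma$). Thus, the linear 
term in $\Gamma^\rho_{\mu\nu}(x(\tau))S_\rho(\tau)$ is $[\partial_\lambda\Gamma^\rho_{\mu\nu}(x_I)S_\rho(\tau_I) + \Gamma^\rho_{\mu\nu}(x_I)\Gamma^\sigma_{\rho\lambda}(x_I)S_\sigma(\tau_I)](x - x_I)^\lambda$. 
Finally, putting things together, we see that the integral in (5) gives 

$$\Delta S_\mu = \left[\partial_\lambda\Gamma^\rho_{\mu\nu}(x_I) + \Gamma^\sigma_{\mu\nu}(x_I)\Gamma^\rho_{\sigma\lambda}(x_I)\right] S_\rho(\tau_I)\, a^{\nu\lambda} \qquad (9)$$

(By now you might have caught on that one of the abilities you need to learn general 
relativity is to change dummy indices in your head.) 
This important result can be written elegantly as 

$$\Delta S_\mu = \tfrac{1}{2} R^\rho{}_{\mu\lambda\nu}\, S_\rho\, a^{\nu\lambda} \qquad (10)$$

as we warmly welcome the natural emergence of our beloved Riemann curvature tensor 

$$R^\rho{}_{\mu\lambda\nu} = \left(\partial_\lambda\Gamma^\rho_{\nu\mu} + \Gamma^\rho_{\lambda\sigma}\Gamma^\sigma_{\nu\mu}\right) - \left(\partial_\nu\Gamma^\rho_{\lambda\mu} + \Gamma^\rho_{\nu\sigma}\Gamma^\sigma_{\lambda\mu}\right) \qquad (11)$$

This amounts to an alternative derivation of the Riemann curvature tensor, a derivation 
showing clearly how the $\partial\Gamma$ and $\Gamma\Gamma$ terms come about: the former from the variation of 
$\Gamma$, the latter from the variation of the vector being transported, as we move around the 
infinitesimal loop. Notice that the Riemann curvature tensor is automatically defined to 
be antisymmetric in its last two indices, since the area element $a^{\nu\lambda}$ is antisymmetric. You 
might have also realized that this derivation is intimately connected with the derivation 
given in chapter VI.1 based on the commutator of two covariant derivatives $[D_\mu, D_\nu]$. See 
also the appendix. 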

The amount by which a vector parallel transported around a closed loop does not come 
back to itself measures the local curvature. 
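
The statement can be tested numerically. In the sketch below, which is an illustration under stated assumptions rather than the text's own calculation, a covector is parallel transported around a small coordinate rectangle on the unit 2-sphere by integrating dS_mu/dtau = Gamma^omega_{mu rho} V^rho S_omega. The result is compared with the second-order prediction, in which the change is [d_lam Gamma^rho_{mu nu} + Gamma^sig_{mu nu} Gamma^rho_{sig lam}] S_rho a^{nu lam}, with a^{nu lam} the loop integral of dx^nu x^lam. The choice of surface, the rectangle, the integrator, and all function names are assumptions made for the example.

```python
import math

def gamma(theta):
    """G[rho][mu][nu] = Gamma^rho_{mu nu} on the unit sphere (index 0 = theta, 1 = phi)."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def dgamma_dtheta(theta):
    """theta-derivative of the Christoffel symbols (they do not depend on phi)."""
    dG = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    dG[0][1][1] = -math.cos(2.0 * theta)
    dG[1][0][1] = dG[1][1][0] = -1.0 / math.sin(theta) ** 2
    return dG

def transport_leg(x_start, x_end, S, steps=200):
    """Runge-Kutta integration of the covector transport equation along a straight coordinate segment."""
    V = [x_end[i] - x_start[i] for i in range(2)]  # tau runs from 0 to 1

    def rhs(tau, s):
        G = gamma(x_start[0] + tau * V[0])
        return [sum(G[w][m][r] * V[r] * s[w] for w in range(2) for r in range(2))
                for m in range(2)]

    h = 1.0 / steps
    for n in range(steps):
        tau = n * h
        k1 = rhs(tau, S)
        k2 = rhs(tau + h / 2, [S[i] + h / 2 * k1[i] for i in range(2)])
        k3 = rhs(tau + h / 2, [S[i] + h / 2 * k2[i] for i in range(2)])
        k4 = rhs(tau + h, [S[i] + h * k3[i] for i in range(2)])
        S = [S[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2)]
    return S

def loop_change(theta0, phi0, d_theta, d_phi, S0):
    """Change in S after one trip around the rectangle, theta leg first."""
    corners = [(theta0, phi0), (theta0 + d_theta, phi0),
               (theta0 + d_theta, phi0 + d_phi), (theta0, phi0 + d_phi), (theta0, phi0)]
    S = list(S0)
    for start, end in zip(corners, corners[1:]):
        S = transport_leg(start, end, S)
    return [S[i] - S0[i] for i in range(2)]

def predicted_change(theta0, d_theta, d_phi, S0):
    """Second-order prediction built from Gamma and its derivatives at the starting point."""
    G, dG = gamma(theta0), dgamma_dtheta(theta0)
    area = d_theta * d_phi
    a = [[0.0, -area], [area, 0.0]]  # loop integral of dx^nu x^lam for this rectangle
    dS = [0.0, 0.0]
    for mu in range(2):
        for nu in range(2):
            for lam in range(2):
                for rho in range(2):
                    d_term = dG[rho][mu][nu] if lam == 0 else 0.0
                    gg_term = sum(G[sig][mu][nu] * G[rho][sig][lam] for sig in range(2))
                    dS[mu] += (d_term + gg_term) * S0[rho] * a[nu][lam]
    return dS

print(loop_change(1.0, 0.3, 0.01, 0.01, [1.0, 0.5]))
print(predicted_change(1.0, 0.01, 0.01, [1.0, 0.5]))
```

The two printed vectors agree up to corrections of higher order in the size of the loop, and both scale with the enclosed area rather than the perimeter.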
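Appendix: Transporting vectors via alternative routes 


Here I give another derivation of the Riemann curvature tensor, a derivation closely related to the one given 
in the text (in fact, essentially the same derivation repackaged). Let us parallel transport a vector $S_\mu$ from 
$x$ to $x + \delta a$, and then from there to $x + \delta a + \delta b$. Along the first leg, we have, according to (2), the change 
$\delta S_\mu(\tau) = \Gamma^\rho_{\mu\nu}(x(\tau))\, \delta a^\nu S_\rho(\tau)$. To find the change along the second leg from $x + \delta a$ to $x + \delta a + \delta b$, we use this 
nifty formula again but evaluated at the new starting point $x + \delta a$ instead of $x$ (of course). We thus obtain the 
change 

$$\Gamma^\rho_{\mu\nu}(x(\tau) + \delta a)\, \delta b^\nu S_\rho(\tau + \delta\tau) \simeq \left(\Gamma^\rho_{\mu\nu}(x(\tau)) + \delta a^\lambda\partial_\lambda\Gamma^\rho_{\mu\nu}\right)\delta b^\nu\left(S_\rho(\tau) + \Gamma^\sigma_{\rho\lambda}\,\delta a^\lambda S_\sigma\right)$$

$$\simeq \Gamma^\rho_{\mu\nu}(x(\tau))\, \delta b^\nu S_\rho(\tau) + \delta b^\nu \delta a^\lambda\left(\partial_\lambda\Gamma^\rho_{\mu\nu} + \Gamma^\sigma_{\mu\nu}\Gamma^\rho_{\sigma\lambda}\right)S_\rho \qquad (12)$$

where in the last step we used the nifty formula yet once again, namely the result just obtained for the change 
along the first leg. The total change in $S_\mu$, after being transported from $x$ to $x + \delta a + \delta b$ via $x + \delta a$, is thus 
given by 

$$\delta S_\mu(\text{from } x \text{ to } x + \delta a + \delta b \text{ via } x + \delta a) = \Gamma^\rho_{\mu\nu}\,\delta a^\nu S_\rho + \Gamma^\rho_{\mu\nu}\,\delta b^\nu S_\rho + \delta b^\nu \delta a^\lambda\left(\partial_\lambda\Gamma^\rho_{\mu\nu} + \Gamma^\sigma_{\mu\nu}\Gamma^\rho_{\sigma\lambda}\right)S_\rho \qquad (13)$$

Suppose we parallel transport $S_\mu$ from $x$ to $x + \delta a + \delta b$ via $x + \delta b$ instead. The difference in the resulting $S_\mu$ 
going by the two different routes is given by subtracting from (13) the same expression with $\delta a \leftrightarrow \delta b$, namely 

$$\Delta S_\mu = \delta b^\nu \delta a^\lambda S_\rho\left[\left(\partial_\lambda\Gamma^\rho_{\mu\nu} + \Gamma^\sigma_{\mu\nu}\Gamma^\rho_{\sigma\lambda}\right) - \left(\partial_\nu\Gamma^\rho_{\mu\lambda} + \Gamma^\sigma_{\mu\lambda}\Gamma^\rho_{\sigma\nu}\right)\right]$$

As expected, the terms linear in $\delta a$ and $\delta b$ in (13) drop out. We have derived the Riemann curvature tensor yet 
one more time. 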


Exercises 


1. Evaluate $a^{\nu\lambda}$ for a rectangle. 

2. Evaluate $a^{\nu\lambda}$ for a small circle. 

3. In elementary discussions of curved surfaces, the reader is often invited to draw a triangle on a sphere and 
to observe that the three angles add up to more than 180°. (Indeed, we already mentioned this fact in the 
prologue to this book.) Show by parallel transporting a vector around such a triangle that the angular excess 
measures the curvature. 


IX.2 Precession of Gyroscopes 


Parallel transport in action 


We now apply the notion of parallel transport to study the precession of a gyroscope in 
curved spacetime. In 2004, a precision gyroscope¹ was launched in a satellite moving in 
an earth orbit, giving us yet another test of Einstein’s theory. For a textbook treatment, we 
take the orbit to be circular and ignore the rotation of the earth, so that we can calculate 
the precession in Schwarzschild spacetime 

$$ds^2 = -\left(1 - \frac{r_S}{r}\right)dt^2 + \left(1 - \frac{r_S}{r}\right)^{-1}dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\varphi^2\right)$$

This effect, known as de Sitter precession or geodetic precession, was first calculated in 
1916 by Willem de Sitter. 
Parallel transport of the spin vector gives 

$$\frac{dS^\mu}{d\tau} + \Gamma^\mu_{\nu\lambda}V^\nu S^\lambda = 0 \qquad (1)$$

As usual, we will work in the equatorial plane ($\theta = \pi/2$), so that the 4-velocity is given 
by $V^\mu = (V^t, 0, 0, V^\varphi)$. We take $S^\mu = (S^t, S^r, 0, S^\varphi)$. In the rest frame of the gyroscope, 
the spin vector is purely spatial, $S^\mu = (0, \vec S)$, while $V^\mu$ is purely temporal, and hence 
$g_{\mu\nu}S^\mu V^\nu = 0$. Since this orthogonality condition $g_{\mu\nu}S^\mu V^\nu = 0$ equates two scalars, it holds 
in all frames. 

Back in chapter VII.1, we determined the circular orbit around a massive object, obtaining 
$V^t = \frac{dt}{d\tau} = \epsilon/\left(1 - \frac{r_S}{r}\right)$ and $V^\varphi = \frac{d\varphi}{d\tau} = l/r^2$, with r the (constant) radius of the 
circular orbit. Furthermore, we found that the two conserved quantities $\epsilon$ and $l$ are given by 
$\epsilon^2 = \left(1 - \frac{r_S}{r}\right)^2\left(1 - \frac{3r_S}{2r}\right)^{-1}$ and $l^2 = \frac{1}{2}r_S r\left(1 - \frac{3r_S}{2r}\right)^{-1}$ (see the discussion around (VII.1.7)). 
We also learned, somewhat to our surprise, that the angular velocity $\Omega = \frac{d\varphi}{dt} = \frac{V^\varphi}{V^t}$ still 
obeys Kepler’s third law 

$$\Omega^2 = \frac{r_S}{2r^3} = \frac{GM}{r^3} \qquad (2)$$

(I remind you that $\Omega$, evidently a constant, is defined in terms of coordinate time t, not 
proper time.) 
Having reviewed the properties of the orbit, we now impose the condition 

$$g_{\mu\nu}S^\mu V^\nu = 0 = V^t\left(g_{tt}S^t + g_{\varphi\varphi}\Omega S^\varphi\right)$$

to obtain a relation between $S^t$ and $S^\varphi$, namely $S^t = r^2\left(1 - \frac{r_S}{r}\right)^{-1}\Omega S^\varphi$. 


The gyroscope precesses 


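You should now be able to work out how the spin vector $S^\mu$ precesses as the particle goes 
around its orbit. Simply plug into the various components of (1) and chug. Try it before 
reading on. 

Insert $V^\varphi = \Omega V^t$ into (1) and note that dividing through by $V^t$ converts the $\tau$ derivative 
to a t derivative. Referring to the Christoffel symbols listed in the collection of formulas at 
the end of the book, you see that only a few terms in (1) do not vanish. We find 

$$\frac{dS^r}{dt} - r\left(1 - \frac{3r_S}{2r}\right)\Omega S^\varphi = 0, \qquad \frac{dS^\varphi}{dt} + \frac{\Omega}{r}S^r = 0 \qquad (3)$$

(in the first equation, $S^t$ has been eliminated using the relation just obtained). The $\mu = t$ 
component of (1) merely provides a consistency check that parallel transport 
maintains the condition $g_{\mu\nu}S^\mu V^\nu = 0$, while the $\mu = \theta$ component simply shows that it is 
consistent with the symmetry of the situation to set $S^\theta = 0$. Combining the two equations 
in (3), we obtain $\frac{d^2S^r}{dt^2} + \Omega_s^2 S^r = 0$ with 

$$\Omega_s^2 = \left(1 - \frac{3r_S}{2r}\right)\Omega^2 \qquad (4)$$

From the elementary solution $S^r \propto \sin\Omega_s t$, we see that after one orbital revolution, 

$$S^r \propto \sin\left(\frac{2\pi\Omega_s}{\Omega}\right) = \sin\left(2\pi\left(\frac{\Omega_s}{\Omega} - 1\right)\right)$$

fails to return to 0. The precession angle is thus given by 

$$\Delta\psi = 2\pi\left(1 - \frac{\Omega_s}{\Omega}\right) = 2\pi\left[1 - \left(1 - \frac{3r_S}{2r}\right)^{1/2}\right] \simeq \pi\left(\frac{3r_S}{2r}\right) \qquad (5)$$

for $r_S \ll r$. 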
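
To get a feeling for the size of the effect, the precession per orbit, 2π[1 − (1 − 3r_S/2r)^{1/2}], can be accumulated over a year of orbits. The following is a rough sketch, not the text's calculation: the orbital altitude of 642 km is an assumed value for a low polar earth orbit, the constants are standard reference values, and the orbital period is taken from Kepler's third law (2) with no distinction made between coordinate and proper time.

```python
import math

GM_EARTH = 3.986004418e14        # m^3/s^2, gravitational parameter of the earth
C = 299_792_458.0                # m/s, speed of light
EARTH_RADIUS = 6.371e6           # m, mean radius
SECONDS_PER_YEAR = 365.25 * 86400.0

def geodetic_precession_per_orbit(r):
    """Precession angle in radians per revolution: 2*pi*(1 - sqrt(1 - 3 r_S / 2r))."""
    r_s = 2.0 * GM_EARTH / C**2            # Schwarzschild radius, restoring factors of c
    x = 1.5 * r_s / r
    # 1 - sqrt(1 - x) is rewritten as x / (1 + sqrt(1 - x)) to avoid
    # catastrophic cancellation when x is tiny.
    return 2.0 * math.pi * x / (1.0 + math.sqrt(1.0 - x))

def geodetic_precession_per_year(altitude):
    """Accumulated precession in arcseconds per year for a circular orbit at the given altitude (m)."""
    r = EARTH_RADIUS + altitude
    omega = math.sqrt(GM_EARTH / r**3)     # Kepler's third law
    orbits_per_year = omega * SECONDS_PER_YEAR / (2.0 * math.pi)
    radians = geodetic_precession_per_orbit(r) * orbits_per_year
    return math.degrees(radians) * 3600.0

# Assumed altitude of 642 km.
print(geodetic_precession_per_year(642e3), "arcsec/year")
```

With these assumed inputs the result is several arcseconds per year, the same order as the figure quoted in the appendix below.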


Appendix: Lense-Thirring precession 


The de Sitter precession must be distinguished from the Lense-Thirring precession (calculated in 1918) of $S^\mu$ 
caused by the rotation of the massive body, the earth in this case. To calculate the latter, we invoke frame dragging 
from chapter VII.5. Far from the rotating body, the frame rotates, and hence the gyroscope precesses, at an angular 
velocity (for an orbit in an equatorial plane) of 

$$-\frac{g_{t\varphi}}{g_{\varphi\varphi}} \simeq \frac{a\, r_S}{r^3} = \frac{2GMa}{r^3} = \frac{2GJ}{r^3} \qquad (6)$$
where I used the result you obtained in exercise VII.5.2. (Plugging in the numbers for a satellite around the 
earth, the Lense-Thirring precession amounts to ~0.05 arcsec/year, while the de Sitter precession comes out to 
be ~7 arcsec/year.) 

We can also determine the Lense-Thirring precession using (1), plugging in the Christoffel symbols appropriate 
for a rotating body. Note that we do not need the full Kerr metric, only the asymptotic behavior of $g_{t\varphi}$ 
for large r. We will see in chapter IX.4 that this leading term, which determines the Lense-Thirring precession, 
can be fixed by general considerations. The logic actually goes full circle: as we explained in chapter VII.5, an 
astronomer could determine the angular momentum of a rotating body, be it a star or a black hole, by measuring 
the Lense-Thirring precession of a gyroscope orbiting that body. 


Note 


1. The experiment, known as Gravity Probe B, was first conceived in 1959. The technological marvels involved 
in constructing a working gyroscope of the required precision are breathtaking. I urge you to search for 
“Gravity Probe B” on the web and read about the experiment, including the controversy it generated. The 
experiment ultimately took 50 years and cost $760 million. See Physics Today, July 2011, p. 14. 


IX.3 Geodesic Deviation 


Separation between geodesics 


Euclid asserted that parallel straight lines will never meet in his space. Indeed, as is well 
known, the failure of Euclid’s famous axiom ushered in the development of modern 
geometry, and the extent of the failure measures the curvature of the space. A familiar 
example is of course the globe we live on. Suppose Ms. U and Mr. P (remember them 
from chapter I.3 and part III?) start out in two neighboring towns on the same latitude 
and fly due south along geodesics, namely lines of constant longitude. We all know that 
the separation between them will change as they move along, eventually vanishing at the 
South Pole. In contrast to Euclidean geometry, two parallel straight lines could eventually 
intersect. 

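Let $x^\mu(\tau)$ and $y^\mu(\tau)$ be two nearby geodesics on a Riemannian manifold. In the example 
just given, they could be two lines of constant longitude, with $\tau$ given by the latitude times 
a constant. Write $y^\mu(\tau) = x^\mu(\tau) + \epsilon^\mu(\tau)$ and study how $\epsilon^\mu(\tau)$ varies with $\tau$. Subtract one 
geodesic equation 

$$\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}(x(\tau))\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} = 0 \qquad (1)$$

from the other 

$$\frac{d^2y^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}(y(\tau))\frac{dy^\nu}{d\tau}\frac{dy^\lambda}{d\tau} = 0 \qquad (2)$$

and expand to first order in $\epsilon$. See figure 1. 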

Just as we are about to plug and grind, we see Professor Flat sauntering toward us, 
mumbling “Tsk tsk,” and we immediately realize why. We hurriedly say, “Yes, Professor, 
we could go to locally flat coordinates and save ourselves a lot of work!” 

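So, let the coordinates be locally flat at the point P described by $x^\mu(\tau)$ for some specified 
value of $\tau$. Hence $\Gamma^\mu_{\nu\lambda}(x(\tau)) = 0$ and $\Gamma^\mu_{\nu\lambda}(y(\tau)) = \Gamma^\mu_{\nu\lambda}(x(\tau) + \epsilon(\tau)) = \epsilon^\sigma\partial_\sigma\Gamma^\mu_{\nu\lambda}(x(\tau))$. Then 
the difference between (1) and (2) collapses in leading order in $\epsilon$ to 

$$\frac{d^2\epsilon^\mu}{d\tau^2} + \epsilon^\sigma\partial_\sigma\Gamma^\mu_{\nu\lambda}(x(\tau))\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} = 0 \qquad (3)$$

Figure 1 Two nearby geodesics can deviate 
from or approach each other. 

But we want the covariant second derivative $\frac{D^2\epsilon^\mu}{D\tau^2}$, not the ordinary second derivative 
$\frac{d^2\epsilon^\mu}{d\tau^2}$. As mentioned in chapter V.6, the covariant derivative of $\epsilon$ along the geodesic is 
defined by 

$$\frac{D\epsilon^\mu}{D\tau} = \frac{d\epsilon^\mu}{d\tau} + \Gamma^\mu_{\nu\lambda}\,\epsilon^\nu\frac{dx^\lambda}{d\tau} \qquad (4)$$

which, unlike $\frac{d\epsilon^\mu}{d\tau}$, is assuredly a vector. The covariant second derivative of $\epsilon$ along the 
geodesic, $\frac{D^2\epsilon^\mu}{D\tau^2}$, is defined by the covariant derivative of $\frac{D\epsilon^\mu}{D\tau}$ along the geodesic, treating 
$\frac{D\epsilon^\mu}{D\tau}$ as a vector (of course). In other words, 

$$\frac{D^2\epsilon^\mu}{D\tau^2} = \frac{d}{d\tau}\left(\frac{d\epsilon^\mu}{d\tau} + \Gamma^\mu_{\nu\lambda}\,\epsilon^\nu\frac{dx^\lambda}{d\tau}\right) + \Gamma^\mu_{\sigma\kappa}\left(\frac{d\epsilon^\sigma}{d\tau} + \Gamma^\sigma_{\nu\lambda}\,\epsilon^\nu\frac{dx^\lambda}{d\tau}\right)\frac{dx^\kappa}{d\tau}$$

$$= \frac{d^2\epsilon^\mu}{d\tau^2} + \left(\partial_\sigma\Gamma^\mu_{\nu\lambda}\right)\epsilon^\nu\frac{dx^\lambda}{d\tau}\frac{dx^\sigma}{d\tau}$$

$$= -\epsilon^\sigma\left(\partial_\sigma\Gamma^\mu_{\nu\lambda}\right)\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} + \left(\partial_\sigma\Gamma^\mu_{\nu\lambda}\right)\epsilon^\nu\frac{dx^\lambda}{d\tau}\frac{dx^\sigma}{d\tau}$$

$$= \left(\partial_\nu\Gamma^\mu_{\sigma\lambda} - \partial_\sigma\Gamma^\mu_{\nu\lambda}\right)\epsilon^\sigma\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} \qquad (5)$$

where we have exploited local flatness in the second equality and used (3) in the third 
equality. In the last equality, we merely rearranged the dummies. 

Now we recognize the expression $(\partial_\nu\Gamma^\mu_{\sigma\lambda} - \partial_\sigma\Gamma^\mu_{\nu\lambda})$ in (5) as just what the Riemann curvature 
tensor $R^\mu{}_{\lambda\nu\sigma} = (\partial_\nu\Gamma^\mu_{\sigma\lambda} + \Gamma^\mu_{\nu\kappa}\Gamma^\kappa_{\sigma\lambda}) - (\partial_\sigma\Gamma^\mu_{\nu\lambda} + \Gamma^\mu_{\sigma\kappa}\Gamma^\kappa_{\nu\lambda})$ reduces to when evaluated 
in locally flat coordinates. Thus, we have derived, quick and fast, the equation¹ of geodesic 
deviation: 

$$\frac{D^2\epsilon^\mu}{D\tau^2} = R^\mu{}_{\lambda\nu\sigma}\,\epsilon^\sigma\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} \qquad (6)$$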
Although we derived this result in locally flat coordinates, the by-now standard argument 
asserts that since both sides of (6) transform in the same way, it holds in any coordinate 
system. 
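
The globe example from the opening of this chapter makes a quick sanity check. On a unit sphere the curvature term in the geodesic deviation equation reduces, for a separation perpendicular to the direction of travel, to an acceleration equal to minus the separation, so two meridians that start parallel at the equator a small distance δ apart should be separated by δ cos s after arclength s. The sketch below is an illustration under these assumptions, using the embedding in three-dimensional Euclidean space rather than any formula from the text; the function names are hypothetical.

```python
import math

def great_circle_point(start, direction, s):
    """Point at arclength s along the unit-sphere geodesic leaving `start` along unit tangent `direction`."""
    return [math.cos(s) * start[i] + math.sin(s) * direction[i] for i in range(3)]

def separation(delta, s):
    """Chord distance after arclength s between two meridians that leave the equator
    heading due north, at longitudes 0 and delta."""
    north = [0.0, 0.0, 1.0]
    a = great_circle_point([1.0, 0.0, 0.0], north, s)
    b = great_circle_point([math.cos(delta), math.sin(delta), 0.0], north, s)
    return math.dist(a, b)

def second_difference(delta, s, h=1e-2):
    """Centered finite-difference estimate of the second derivative of the separation with respect to s."""
    return (separation(delta, s + h) - 2.0 * separation(delta, s) + separation(delta, s - h)) / h**2

delta = 1e-3
for s in (0.0, 0.5, 1.0, 1.5):
    print(s, separation(delta, s), delta * math.cos(s), second_difference(delta, s))
```

The separation tracks δ cos s and its second derivative tracks minus the separation: the geodesics start out parallel, yet curvature pulls them together until they cross at the pole.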


The Riemann curvature tensor pops out, as if by magic. Of course, at least in hindsight, 
we know that it must: the vector $\frac{D^2\epsilon^\mu}{D\tau^2}$ has to be linear in $\epsilon^\lambda$, and it must depend on the 
tangent or velocity vector $\frac{dx^\mu}{d\tau}$ at that point. The tangent vector must appear twice, not once, 
since we don’t have a tensor with an odd number of indices that measures curvature. We 
also knew from the start that $\partial\Gamma$ must be involved. According to Euclid, the separation 
between two straight lines could only grow linearly; the second derivative $\frac{D^2\epsilon^\mu}{D\tau^2}$, sometimes 
called an “acceleration,” reveals the presence of curvature. 

For most purposes in physics, we naturally deal with timelike geodesics, but clearly (6) 
also applies to spacelike geodesics if we replace $\tau$ by the proper length along the geodesic. 
This hardly merits a remark since, after all, we obtained the formalism for determining 
geodesics with curves in space in the first place, not curves in spacetime. 


Geodesic deviation, tidal force, and congruence of geodesics 


We can now make contact with the Newtonian tidal force discussed way back in chapter I.4. 
Remember the ring of balls falling? The separation between two nearby balls evolves 
according to (I.4.9) 

$$\frac{d^2\epsilon^i}{dt^2} = -R^{ij}\epsilon^j \qquad (7)$$

which you now recognize as the Newtonian analog of (6). 

We studied the separation between two geodesics, but more generally, we could consider 
a collection of geodesics $x^\mu(\tau, \sigma)$, distinguished from each other by a label $\sigma$. For instance, 
in cosmology, on distance scales such that galaxies could be treated as idealized mass 
points, with each galaxy tracing out a geodesic, the entire collection of geodesics could serve 
as a coordinate system for the universe. The label $\sigma$ would then be 3-dimensional. Indeed, 
we have already mentioned this possibility when we discussed comoving coordinates back 
in chapter V.3. 

In Italo Calvino’s masterpiece Cosmicomics, the narrator, a man named Qfwfq, falls 
through spacetime, with his worldline tantalizingly close to that of a beautiful woman 
named Ursula H’x and that of a man named Fenimore. He is desperately in love with 
Ursula, but try as he may, he can’t decrease the separation between his geodesic and 
her geodesic, seething in dismay and anger as he watches her geodesic and Fenimore’s 
geodesic getting closer and closer to each other, or so it seems to him. The whole thing is 
told in the mind of Qfwfq. 

This rather short chapter is now followed by four appendices. The first three are devoted 
to, respectively, a mathematically more sophisticated (not necessarily better in my opinion) 
derivation of geodesic deviation, the behavior of a bundle of timelike geodesics, and what 
can be proved about that behavior given various assumptions about the energy momentum 
tensor. The fourth appendix is somewhat tediously long and devoted to Fermi normal 
coordinates. The reader encountering Einstein gravity for the first time could certainly 
skip over this, or perhaps read only the first section, in which I explain intuitively why 
what is said to be true* is in fact true. 


Appendix 1: Lie derivative and geodesic deviation 


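I will use the concept of Lie derivative introduced in chapter V.6 to give another derivation of (6). Consider a 
“dense” collection of timelike geodesics $x^\mu(\tau, \sigma)$, labeled by a real variable $\sigma$ and defining a surface. Mathematicians 
say that the geodesics are knitted together to form a surface. Define the tangent vectors $V^\mu = \frac{\partial x^\mu}{\partial\tau}$ and the 
deviation vectors $\epsilon^\mu = \frac{\partial x^\mu}{\partial\sigma}$. Evaluate the Lie derivative 

$$\mathcal{L}_V\epsilon^\mu = D_V\epsilon^\mu - D_\epsilon V^\mu = [V, \epsilon]^\mu = V^\nu\partial_\nu\epsilon^\mu - \epsilon^\nu\partial_\nu V^\mu = \frac{\partial\epsilon^\mu}{\partial\tau} - \frac{\partial V^\mu}{\partial\sigma} = \frac{\partial^2x^\mu}{\partial\tau\,\partial\sigma} - \frac{\partial^2x^\mu}{\partial\sigma\,\partial\tau} = 0 \qquad (8)$$

The first two equal signs merely state two alternative expressions for $\mathcal{L}_V\epsilon$. Now act with $D_V$ on the equation 
$D_V\epsilon^\mu = D_\epsilon V^\mu$ we just derived: 

$$D_VD_V\epsilon^\mu = D_VD_\epsilon V^\mu = D_\epsilon D_VV^\mu + D_{[V,\epsilon]}V^\mu + R^\mu{}_{\nu\lambda\sigma}V^\lambda\epsilon^\sigma V^\nu = R^\mu{}_{\nu\lambda\sigma}V^\nu V^\lambda\epsilon^\sigma \qquad (9)$$

You derived the second equality in exercise VI.1.16, namely that for three vector fields U, V, W, we have 
$D_UD_VW^\mu - D_VD_UW^\mu = D_{[U,V]}W^\mu + R^\mu{}_{\nu\lambda\sigma}U^\lambda V^\sigma W^\nu$. For the third equality, we used $D_VV^\mu = V^\nu D_\nu V^\mu = 0$ 
(which you recognize as the geodesic equation) and $[V, \epsilon] = 0$ (which we just derived in (8)). Now note that 
$D_V\epsilon^\mu = V^\nu D_\nu\epsilon^\mu = \frac{D\epsilon^\mu}{D\tau}$ and so $D_VD_V\epsilon^\mu = \frac{D^2\epsilon^\mu}{D\tau^2}$. Thus, (9) is precisely (6). Even quicker and faster! 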


Appendix 2: The Raychaudhuri equation 


Mathematically, a bundle of timelike geodesics $x^\mu(\tau, \sigma^1, \sigma^2, \sigma^3)$, labeled by three real variables $\sigma$, is known as a 
congruence in a certain region if, in that region, each point lies on one and only one geodesic. This implies that 
the geodesics in our bundle do not intersect. As soon as they intersect, the “congruence” is over. As mentioned 
in the text, a congruence of timelike geodesics could serve to coordinatize that region. 

Pick a point P on one specific geodesic and denote the tangent vector by $V^\mu = \frac{dx^\mu}{d\tau}$. We have $V^\mu V_\mu = -1$, the 
definition of $\tau$, and $V^\nu D_\nu V^\mu = 0$, the geodesic equation. Notation alert! In the definition of $V^\mu$, the differentiation 
with respect to $\tau$ is clearly to be done holding the labels $\sigma$ fixed; we want to follow a specific geodesic. But 
throughout this book, the tangent vector to a geodesic has always been denoted by $\frac{dx^\mu}{d\tau}$ (since we have always 
considered one single geodesic at a time, or at most two geodesics, as in this chapter), and it would be odd to 
suddenly start writing $\frac{\partial x^\mu}{\partial\tau}$ for the tangent vector. I think that it would be best, in this appendix and in appendix 4, 
to ask you to keep in mind what you are holding fixed by following the physics, rather than to have vertical bars 
all over the place indicating what is being held fixed, particularly in appendix 4 with its abundance of vertical 
bars (as you will see). 

The 3 vectors $W^\mu = \frac{dx^\mu}{d\sigma}$ (for the 3 possible choices of $\sigma$; again, when differentiating with respect to one of 
the $\sigma$s, we hold the other two $\sigma$s and $\tau$ fixed) orthogonal to $V^\mu$ span a 3-dimensional subspace. The matrix 
$P^{\mu\nu} = g^{\mu\nu} + V^\mu V^\nu$ clearly projects into this subspace: $P^{\mu\nu}V_\nu = 0$, $P^{\mu\nu}P_\nu{}^\lambda = P^{\mu\lambda}$, and $P^{\mu\nu}P_{\mu\nu} = 3$. Then 

$$\frac{DW^\mu}{D\tau} = V^\nu D_\nu W^\mu = W^\nu D_\nu V^\mu = B^\mu{}_\nu W^\nu \qquad (10)$$


* In this connection, I quote from a review of QFT Nut for the American Mathematical Society: “It is often 
deeper to know why something is true rather than to have a proof that it is true.” See 
http://www.kitp.ucsb.edu/members/PM/zee/revMath.html. 




where we used (8) with a trivial change of notation. Think of $B^\mu{}_\nu$ as a matrix acting on W to tell us the rate of 
change of W. 

The idea is to derive an equation for $B_{\mu\nu} = D_\nu V_\mu$. First, note that $V^\mu B_{\mu\nu} = V^\mu D_\nu V_\mu = \frac{1}{2}D_\nu(V^\mu V_\mu) = 0$ and $B_{\mu\nu}V^\nu = 
(D_\nu V_\mu)V^\nu = V^\nu D_\nu V_\mu = 0$. In other words, $B_{\mu\nu}$ lives in the 3-dimensional subspace. Recall how we decomposed 
2-indexed tensors in chapter I.4. That piece of knowledge now comes in handy. Decompose $B_{\mu\nu}$ as 

$$B_{\mu\nu} = \sigma_{\mu\nu} + \tfrac{1}{3}\theta P_{\mu\nu} + \omega_{\mu\nu} \qquad (11)$$

Think of the geodesics as describing the motion of particles that form a cloud or fluid. You can see that each of 
the terms in (11) corresponds to a property of the flow. The trace $\theta = P^{\mu\nu}B_{\mu\nu} = D_\mu V^\mu$ describes expansion, the 
symmetric traceless part $\sigma_{\mu\nu} = \frac{1}{2}(B_{\mu\nu} + B_{\nu\mu}) - \frac{1}{3}\theta P_{\mu\nu}$ shear, and the antisymmetric part $\omega_{\mu\nu} = \frac{1}{2}(B_{\mu\nu} - B_{\nu\mu})$ 
rotation. (If you think about the spatial components of the corresponding quantities for a vector field in flat 
spacetime, for example $\omega_{ij} = \frac{1}{2}(\partial_jV_i - \partial_iV_j)$, you would understand the origin of the names² expansion, shear, 
and rotation.) 

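We are now ready to differentiate: 

$$\frac{DB_{\mu\nu}}{D\tau} = V^\lambda D_\lambda B_{\mu\nu} = V^\lambda D_\lambda D_\nu V_\mu = V^\lambda D_\nu D_\lambda V_\mu + V^\lambda[D_\lambda, D_\nu]V_\mu$$

$$= D_\nu\left(V^\lambda D_\lambda V_\mu\right) - \left(D_\nu V^\lambda\right)\left(D_\lambda V_\mu\right) - V^\lambda R^\sigma{}_{\mu\lambda\nu}V_\sigma$$

$$= -B^\lambda{}_\nu B_{\mu\lambda} - R_{\sigma\mu\lambda\nu}V^\sigma V^\lambda \qquad (12)$$

In the last step we used the geodesic equation and the definition of $B_{\mu\nu}$. This equation, known as the Raychaudhuri 
equation, governs how $B_{\mu\nu}$ varies as we move along a geodesic. Not surprisingly, as explained in the text, 
the Riemann curvature tensor appears. Spacetime curvature, aka the gravitational field, changes $B_{\mu\nu}$. 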


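Often, we are mostly interested in knowing about whether a bundle of geodesics converges or diverges, a 
question determined by the expansion parameter $\theta$. Since $\frac{d\theta}{d\tau} = g^{\mu\nu}\frac{DB_{\mu\nu}}{D\tau}$, we could extract an equation for $\frac{d\theta}{d\tau}$ 
by contracting (12) with $g^{\mu\nu}$. 
Contracting the first term on the right hand side of (12) calls for a bit of work: 

$$g^{\mu\nu}B^\lambda{}_\nu B_{\mu\lambda} = B^{\lambda\mu}B_{\mu\lambda} = \left(\sigma^{\mu\lambda} + \tfrac{1}{3}\theta P^{\mu\lambda} - \omega^{\mu\lambda}\right)\left(\sigma_{\mu\lambda} + \tfrac{1}{3}\theta P_{\mu\lambda} + \omega_{\mu\lambda}\right) = \sigma_{\mu\nu}\sigma^{\mu\nu} + \tfrac{1}{3}\theta^2 - \omega_{\mu\nu}\omega^{\mu\nu} \qquad (13)$$

Notice, in the last step, the wisdom of decomposing tensors into pieces with different symmetry properties, as 
was advocated in chapter I.4. 

Contracting the second term in (12) with $g^{\mu\nu}$, we watch the Ricci tensor pop out. 

We thus obtain the desired result 

$$\frac{d\theta}{d\tau} = -\tfrac{1}{3}\theta^2 - \sigma_{\mu\nu}\sigma^{\mu\nu} + \omega_{\mu\nu}\omega^{\mu\nu} - R_{\mu\nu}V^\mu V^\nu \qquad (14)$$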

This equation could be used to prove various theorems. To see how, first suppose that the geodesics are not 
rotational, namely that $\omega_{\mu\nu} = 0$. Since $V^\mu$ is timelike, the conditions $V^\mu B_{\mu\nu} = 0$ and $B_{\mu\nu}V^\nu = 0$ tell us that $B_{\mu\nu}$ 
is a purely spatial tensor, just like $P_{\mu\nu}$. Hence $\sigma_{\mu\nu}$ is also a purely spatial tensor. (This is also easily seen by going 
to a frame in which $V^\mu = (1, \vec 0)$ at the point in question: $B_{ij}$ and $P_{ij}$ are the only nonzero components of the 
tensors B and P, respectively, and hence $\sigma_{ij}$ are the only nonzero components of the tensor $\sigma$.) Then we have 
$\sigma_{\mu\nu}\sigma^{\mu\nu} = \sigma_{ij}\sigma^{ij} \geq 0$. 

Thus, with the assumption $\omega_{\mu\nu} = 0$, we obtain the inequality 

$$\frac{d\theta}{d\tau} \leq -\tfrac{1}{3}\theta^2 - R_{\mu\nu}V^\mu V^\nu \qquad (15)$$

To go further, we have to deal with the last term, which we can write, using Einstein’s field equation, as 

$$R_{\mu\nu}V^\mu V^\nu = 8\pi G\left(T_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}T\right)V^\mu V^\nu = 8\pi G\left(T_{\mu\nu}V^\mu V^\nu - \tfrac{1}{2}T\, V\cdot V\right)$$
Appendix 3: Energy conditions—weak, dominant, and strong 


By making various debatable assumptions about $T_{\mu\nu}$, one can prove various debatable theorems. Depending on 
their inclinations, various authors regard one or more of these assumptions, known as energy conditions, as 
“self-evident,” or at least “plausible.” We can most easily understand the energy conditions listed in the literature 
by supposing that $T_{\mu\nu}$ has the perfect fluid form $T^{\mu\nu} = (\rho + P)U^\mu U^\nu + Pg^{\mu\nu}$ (as has been discussed repeatedly, 
for example, in chapters III.6, VII.4, and VIII.1). We mention only a few here; you can make up your own. 


1. The weak energy condition states that $T_{\mu\nu}V^\mu V^\nu \geq 0$ for all timelike Vs, which implies* that $\rho \geq 0$ and 
$(\rho + P) \geq 0$. 

2. The dominant energy condition presupposes the weak energy condition and requires in addition that 
$T_{\mu\nu}V^\nu$ is not spacelike for all timelike Vs, namely that $g^{\mu\rho}(T_{\mu\nu}V^\nu)(T_{\rho\sigma}V^\sigma) \leq 0$, which amounts to 
$\rho^2 \geq P^2$. 

3. The strong energy condition states that $T_{\mu\nu}V^\mu V^\nu \geq \frac{1}{2}T(V^\mu V_\mu)$ for all timelike Vs, which implies 
$(\rho + P) \geq 0$ and $(\rho + 3P) \geq 0$. 


As an exercise, you could verify the stated implications for these three energy conditions. 
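
A sketch of how that verification might go, following the hint in the footnote. The parametrization $U^\mu = (1, 0, 0, 0)$, $V^\mu = (\cosh\varphi, \sinh\varphi, 0, 0)$ and the signature $(-,+,+,+)$ are assumptions made here for the illustration; with them $U\cdot V = -\cosh\varphi$, $V\cdot V = -1$, and $T = -\rho + 3P$.

```latex
\begin{align*}
T_{\mu\nu}V^\mu V^\nu &= (\rho + P)(U\cdot V)^2 + P\,V\cdot V
   = \rho\cosh^2\varphi + P\sinh^2\varphi \\
T_{\mu\nu}V^\mu V^\nu - \tfrac{1}{2}T\,V\cdot V
   &= (\rho + P)\cosh^2\varphi + \tfrac{1}{2}(P - \rho) \\
g^{\mu\rho}(T_{\mu\nu}V^\nu)(T_{\rho\sigma}V^\sigma)
   &= -\rho^2\cosh^2\varphi + P^2\sinh^2\varphi
\end{align*}
% Weak:     \varphi = 0 gives \rho \ge 0;  \varphi \to \infty gives \rho + P \ge 0.
% Strong:   \varphi = 0 gives \tfrac{1}{2}(\rho + 3P) \ge 0;  \varphi \to \infty gives \rho + P \ge 0.
% Dominant: \varphi \to \infty gives \rho^2 \ge P^2.
```

Because the fluid is isotropic in its rest frame, boosting along the first axis loses no generality.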
Going back to the inequality (15)

$$\frac{d\theta}{d\tau} \le -\frac{1}{3}\theta^2 - 8\pi G\left(T_{\mu\nu}V^\mu V^\nu - \frac{1}{2}T\,V\cdot V\right)$$

we see that if we have the strong energy condition, we can conclude that $d\theta/d\tau \le 0$, so that the geodesics approach
each other. The congruence of geodesics “focuses.”
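
The focusing can be made quantitative with one more line. This is a standard consequence of the inequality rather than something spelled out in the text; $\theta_0$ denotes the assumed initial value of $\theta$ at $\tau = 0$.

```latex
\begin{align*}
\frac{d\theta}{d\tau} \le -\frac{1}{3}\theta^2
\quad\Longrightarrow\quad
\frac{d}{d\tau}\left(\frac{1}{\theta}\right) \ge \frac{1}{3}
\quad\Longrightarrow\quad
\frac{1}{\theta(\tau)} \ge \frac{1}{\theta_0} + \frac{\tau}{3}
\end{align*}
% If the congruence starts out converging, \theta_0 < 0, then 1/\theta must reach zero from below,
% that is, \theta \to -\infty, within a proper time \tau \le 3/|\theta_0|.
```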

The dark energy, aka the cosmological constant, is an interesting case. It violates the strong energy condition, 
since $(\rho + 3P) = (\rho - 3\rho) \ngeq 0$. Notice, however, that a positive cosmological constant, with $P = -\rho$ and $\rho > 0$,
satisfies both the weak and dominant energy conditions (barely). At one time, most physicists would say that 
the strong energy condition evidently holds, since both matter and radiation (recall chapter VIII.1) satisfy it. But 
what is self-evident to one person may not be so obvious to another! 
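
As a rough numerical companion to these statements, the sketch below scans boosted observers $V^\mu = (\cosh\varphi, \sinh\varphi, 0, 0)$ for a perfect fluid at rest and reports which conditions hold. The function names, the rapidity cutoff, and the tolerance are illustrative choices, not anything defined in the text, and a finite scan can only suggest, not prove, the "for all $V$" statements.

```python
import math

def contractions(rho, P, phi):
    """Perfect fluid at rest seen by V = (cosh phi, sinh phi, 0, 0), signature (-,+,+,+)."""
    c, s = math.cosh(phi), math.sinh(phi)
    t_vv = rho * c * c + P * s * s              # T_{mu nu} V^mu V^nu
    trace = -rho + 3.0 * P                      # T = g^{mu nu} T_{mu nu}
    strong_lhs = t_vv + 0.5 * trace             # T_{mu nu}V^mu V^nu - (1/2) T V.V, with V.V = -1
    flux_norm = -(rho * c) ** 2 + (P * s) ** 2  # norm of the vector T_{mu nu} V^nu
    return t_vv, strong_lhs, flux_norm

def energy_conditions(rho, P, max_rapidity=6.0, steps=121, tol=1e-9):
    """Check the weak, dominant, and strong conditions on a grid of rapidities (assumed cutoff)."""
    weak = strong = not_spacelike = True
    for n in range(steps):
        phi = max_rapidity * n / (steps - 1)
        t_vv, strong_lhs, flux_norm = contractions(rho, P, phi)
        weak = weak and t_vv >= -tol
        strong = strong and strong_lhs >= -tol
        not_spacelike = not_spacelike and flux_norm <= tol
    # The dominant condition presupposes the weak one.
    return {"weak": weak, "dominant": weak and not_spacelike, "strong": strong}

if __name__ == "__main__":
    for name, rho, P in [("dust", 1.0, 0.0), ("radiation", 1.0, 1.0 / 3.0),
                         ("cosmological constant", 1.0, -1.0)]:
        print(name, energy_conditions(rho, P))
```

With these inputs the cosmological constant passes the weak and dominant checks but fails the strong one, in line with the discussion above.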

So what is a “reasonable” $T^{\mu\nu}$? Quantum field theorists would probably say that whatever $T^{\mu\nu}$ is produced by
a field theory satisfying basic principles (such as unitarity and causality) is physically reasonable. 

This is a good place to mention one apparently easy way to generate solutions of Einstein’s equation. Write 
down a spacetime metric that you like. Calculate its Ricci tensor and plug into Einstein’s equation, which specifies 
the $T_{\mu\nu}$ that would produce that particular spacetime. It’s a cinch! The catch is that the $T_{\mu\nu}$ you obtain this way
would most likely not satisfy the various energy conditions. You don’t necessarily have the right stuff to produce 
the spacetime you like. 


Appendix 4: Fermi normal coordinates 


Intuitive motivation 


Way way back in chapter I.6, we discussed the fairly obvious fact that at a given point P we could always choose 
locally flat coordinates. In fact, we could have coordinates, known as Fermi† normal coordinates, that are locally
flat not only at a single point but also along an entire geodesic. The mathematical demonstration, that these
coordinates are always available to us, is rather long‡ and involved. So it would be best if I first describe a physical
picture that makes the result more or less obvious. 

Consider an observer moving along a timelike geodesic $\gamma$ with the tangent vector $V^\mu$.

Our friend the Smart Experimentalist interrupts, “It’s obvious. I work in a lab attached to some planet going 
around some star, acted upon by gravity, so my lab is moving along a timelike geodesic. First, with gyroscopes I 
make sure that my lab is not spinning. Then I use my watch, namely my proper time, to mark the time coordinate, 


* One way to show this is to go to the rest frame of $U^\mu$ and write $V^\mu = (\cosh\varphi, \sinh\varphi, 0, 0)$.
† Published in 1922.
‡ As I warned you, this appendix is longer than the main text!
Figure 2 Construction of Fermi normal coordinates that are locally flat, not only at a single point
but also along an entire geodesic.


and the walls of the lab to set up the space coordinates. Right there I have the locally flat coordinates of my choice. 
With due respect to my great experimental colleague Fermi, isn’t that it?” 

Yes, that’s more or less it. But the long discussion we are about to embark on will yield more. Not only will we 
confirm SE’s intuitive picture, we will also obtain precise expressions (see (30) below) for the second derivatives 
of the metric evaluated on $\gamma$.

So, to continue, we are given the coordinates $x^\mu$, and we want to find the Fermi normal coordinates $y^\mu =
(y^0, y^i) = (T, y^i)$. For convenience, we gave $y^0$ the nickname $T$. Focus on a point $P$ on $\gamma$ and assign to point $P$
the time $y^0 = T = \tau$, the proper time $\tau$ elapsed since proper time started ticking at some point in the past. See
figure 2.

“But that’s exactly what I said!” exclaims SE. Yes indeed, we reassure her. Now we set up a set of four
orthonormal vectors $e^\mu_M$, with $M = (0, a) = (T, a)$. We revel in notational “redundancy,” writing the subscript
$T$ instead of 0 for emphasis. As SE suggested, we use $e^\mu_T = V^\mu$, with orthonormality

$$g_{\mu\nu}e^\mu_M e^\nu_N = \eta_{MN} \tag{16}$$

Once this is set up at $P$, we parallel transport (our gyroscopes are of high quality!) the 3 $e_a$s (that is, $V^\nu D_\nu e^\mu_a = 0$)
so that orthonormality always holds. The parallel transport of $e^\mu_T$ is automatic by virtue of $\gamma$ being a geodesic.


Tentacles consisting of spacelike geodesics 


To coordinatize the spacetime surrounding $\gamma$, we, sitting at $P$, now send out tentacles⁴ consisting of spacelike
geodesics $\zeta(n^a, P)$ with $n^a$ a unit vector (that is, a vector satisfying $\delta_{ab}n^an^b = 1$), determining the direction in
which a particular geodesic $\zeta(n^a, P)$ emanates from $P$. See figure 2. In other words, the geodesic $\zeta(n^a, P)$ is
characterized by the point $P$ and the direction $n^a$ in which it sallies forth from $P$.

More precisely, let $W^\mu$ denote the tangent vector along this geodesic $\zeta(n^a, P)$; then

$$W^\mu|_\gamma = e^\mu_a n^a \tag{17}$$

We shall henceforth use the notation $X|_\gamma$ to indicate that the expression $X$ is evaluated on the geodesic $\gamma$. SE
mumbles, “That’s a lot of mathematical mumbling for what should be obvious to a child.” I agree.

Onward to Fermi normal coordinates. Suppose that, after a proper distance $\sigma$, this particular geodesic $\zeta(n^a, P)$
reaches some point $Q$. Then we assign to the point $Q$ the coordinates

$$y^0 = T, \qquad y^i = \sigma n^a\delta^i_a \tag{18}$$

Note that all these quantities depend on $P$ as well as on $Q$: for example, $\sigma$ is the proper distance along the geodesic
$\zeta(n^a, P)$ from $P$ to $Q$.

A note about notational clarity versus precision: in (18) I have chosen to suppress the dependence on $P$ and
assume that you know that $y^\mu$ denotes the coordinates of a particular point $Q$. I prefer to make it somewhat more
mentally challenging to the reader (who has to quickly remind himself or herself of what depends on what) than
to produce a page bristling with even more subscripts and superscripts than it already has. In any case, if you
don’t like it, grab a pen and mark the dependence on $P$ and $Q$. As it is, I am already being excessively pedantic
in writing in (18) $n^a\delta^i_a$ instead of just plain $n^i$: what is meant is clearly $y^1 = \sigma n^1$, $y^2 = \sigma n^2$, $y^3 = \sigma n^3$.

SE exclaims, “I can’t stand all this math; it’s just common sense. Something happens in my lab. I point to it,
the direction of my finger is $n$, the distance to that something is $\sigma$, and the time on my watch is $T$.”

I said, “Yup, that’s it. Enrico Fermi is all for common sense! I am also all for common sense! The Fermi
normal coordinates of $Q$ are given by $y^\mu$.”

To summarize, to determine the Fermi normal coordinates $y^\mu$ of a point $Q$, we have to ascertain the point
$P$ on $\gamma$ from which a spacelike geodesic $\zeta$ would reach $Q$. We measure the proper length $\sigma$ between $P$ and $Q$
along this geodesic $\zeta$, and the direction $n^a$ in which $\zeta$ emerges from $P$. The spacelike geodesics might eventually
intersect, as discussed in appendix 2, but as long as they don’t, every point $Q$ in a finite region around the geodesic
$\gamma$ is uniquely characterized by $(T, n^a, \sigma)$, the time on our friend’s watch, the direction her finger is pointing in,
and the geodesic distance from her to the point $Q$. Note that in the Fermi normal coordinates, the geodesic $\gamma$ is
described by $y^\mu = (T, \vec 0)$, and the geodesic $\zeta$ by $y^\mu = (T, \sigma n^a\delta^i_a)$.


The metric in Fermi normal coordinates 


Now the “hard” part: determine the metric in Fermi normal coordinates 


$$g^F_{\lambda\rho} = g_{\mu\nu}\frac{\partial x^\mu}{\partial y^\lambda}\frac{\partial x^\nu}{\partial y^\rho} \tag{19}$$


First, we have to relate $y$ to $x$. For an arbitrary event $Q$ in spacetime, its $y$ coordinates are given by (18).
What about its $x$ coordinates $x^\mu(T, n^a; \sigma)$? (A friendly reminder: $x^\mu$ are the coordinates we started with, and $y^\mu$
are the Fermi normal coordinates.) It is the solution of the spacelike geodesic equation* $\frac{d^2x^\mu}{d\sigma^2} + \Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\sigma}\frac{dx^\lambda}{d\sigma} = 0$
with the initial position specified by $T$ on $\gamma$ and initial “velocity” in the direction $n^a$. (This is conceptually the
same as the freshman physics problem of solving Newton’s equation to determine the position of a particle after
a certain time, starting with some initial position and velocity.) To get oriented, note that $x^\mu(T, n^a; 0)$ is just the
point $P$ and that $\frac{\partial x^\mu}{\partial y^0}|_\gamma = \frac{\partial x^\mu}{\partial\tau}|_\gamma = e^\mu_T$ is just the tangent vector $V^\mu$ to the geodesic $\gamma$ at $P$. Also, the tangent vector
of the spacelike geodesic on which the point $Q$ sits is given by

$$W^\mu(T, n^a; \sigma) = \left(\frac{\partial x^\mu}{\partial\sigma}\right)_{T, n^a} \tag{20}$$

with $W^\mu(T, n^a; 0) = n^a e^\mu_a$.
To relate⁵ $x^\mu(T, n^a; \sigma)$ to $y^\mu$, namely $y^\mu = (T, \sigma n^a)$, we note that the geodesic equation is invariant upon
rescaling $\sigma \to f^{-1}\sigma$, with $f$ an arbitrary real factor, under which $W^\mu \to fW^\mu$ and $n^a \to fn^a$. Thus,

$$x^\mu(T, n^a; \sigma) = x^\mu(T, fn^a; f^{-1}\sigma) = x^\mu(T, \sigma n^a) = x^\mu(y) \tag{21}$$


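Going back to (17) and (20), we have

$$e^\mu_a n^a = W^\mu|_\gamma = \left(\frac{\partial x^\mu}{\partial\sigma}\right)^{\sigma=0}_{T, n^a} = \left(\frac{\partial x^\mu}{\partial y^i}\frac{\partial y^i}{\partial\sigma}\right)^{\sigma=0}_{T, n^a} = \left.\frac{\partial x^\mu}{\partial y^i}\right|_\gamma n^a\delta^i_a \tag{22}$$

from which we conclude $\frac{\partial x^\mu}{\partial y^i}|_\gamma = e^\mu_a\delta^a_i$. Plugging this and $\frac{\partial x^\mu}{\partial y^0}|_\gamma = e^\mu_T$ into (19), we find

$$g^F_{\lambda\rho}|_\gamma = \eta_{\lambda\rho} \tag{23}$$

It’s just (16)!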


* Here I am faced with the notational dilemma of whether to use X or to stick with x, as per a bad notation
alert way back in chapter I.1. On balance, I think sticking with x in this context is a bit clearer, as I have already
done in the text and appendix 2 of this chapter.
To impress the Smart Experimentalist, we need to go further!


But our friend the Smart Experimentalist snickers, “You theorists, so much huffing and puffing for a result long 
obvious to me.” 

We agree. “Yes, but this is only zeroth order. The point is, now that the formalism is set up, we can calculate 
the metric to second order.” 

SE brags, “It is obvious; I already know what the metric to second order will depend on.” 

How about you? Did you guess that the Riemann curvature tensor evaluated on $\gamma$ will have to come in?

As a warm up, tackle the metric to first order, namely the Christoffel symbols. Intuitively, both SE and we
expect them to vanish. Normally, we use the geodesic equation to determine the geodesics. In this context, we
already know the geodesics in the Fermi normal coordinates, namely $\zeta$ described by $y^\mu = (T, \sigma n^a\delta^i_a)$ and $\gamma$ by
$y^\mu = (T, \vec 0)$, which we will now plug into the appropriate geodesic equation to get information on the Christoffel
symbols.

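So, plug the geodesic $\zeta$ into $\frac{d^2y^\mu}{d\sigma^2} + \Gamma^\mu_{\nu\lambda}\frac{dy^\nu}{d\sigma}\frac{dy^\lambda}{d\sigma} = 0$. (See the notation alert in appendix 2; we are following a
single specific geodesic here. Later, when we want to be extra clear, we will use partial derivatives and specify
what variables are being held fixed.) Since $y^\mu$ is at most linear in $\sigma$, we learn that $\Gamma^\mu_{\nu\lambda}\frac{dy^\nu}{d\sigma}\frac{dy^\lambda}{d\sigma} = 0$, which when
evaluated on the timelike geodesic $\gamma$ gives $\Gamma^\mu_{ij}|_\gamma n^in^j = 0$ (with* $n^i = n^a\delta^i_a$, clearly). Since the direction of $n^i$ is
arbitrary, we conclude (show this!) $\Gamma^\mu_{ij}|_\gamma = 0$.

Similarly, plug the geodesic $\gamma$ into $\frac{d^2y^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}\frac{dy^\nu}{d\tau}\frac{dy^\lambda}{d\tau} = 0$ to obtain $\Gamma^\mu_{TT}|_\gamma = 0$. Also, by construction we parallel
transport $e^\mu_M = (e^\mu_T, e^\mu_a)$ along $\gamma$, so $V^\nu(\partial_\nu e^\mu_M + \Gamma^\mu_{\nu\lambda}e^\lambda_M)|_\gamma = 0$. On $\gamma$, $e^\mu_M = \delta^\mu_M$ and so $\Gamma^\mu_{Tj}|_\gamma = 0$ and $\Gamma^\mu_{TT}|_\gamma = 0$.

As expected, the Christoffel symbols vanish on $\gamma$.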


The Riemann curvature tensor evaluated on $\gamma$


That was easy, but now let’s determine the first derivatives of the Christoffel symbols on $\gamma$. The idea is to determine
these derivatives in terms of the Riemann curvature tensor evaluated on $\gamma$.
First, the vanishing of the Christoffel symbols on $\gamma$ allows us to conclude immediately that

$$\Gamma^\mu_{\nu\lambda,T}|_\gamma = 0 \tag{24}$$

Furthermore, the Riemann curvature tensor simplifies to $R^\mu{}_{\nu\rho\lambda}|_\gamma = (\Gamma^\mu_{\nu\lambda,\rho} - \Gamma^\mu_{\nu\rho,\lambda})|_\gamma$. In particular, using (24)
we obtain

$$\Gamma^\mu_{\nu T,i}|_\gamma = R^\mu{}_{\nu iT}|_\gamma \tag{25}$$


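To determine the other derivatives of the Christoffel symbols, we finally have to invoke the geodesic deviation
equation (6) we derived in the text and which we rewrite in the form (which I now ask you to derive as an exercise)

$$\frac{d^2\epsilon^\mu}{d\sigma^2} + 2\Gamma^\mu_{\nu\lambda}W^\nu\frac{d\epsilon^\lambda}{d\sigma} + \left(\Gamma^\mu_{\nu\lambda,\rho} - \Gamma^\mu_{\lambda\kappa}\Gamma^\kappa_{\rho\nu} + \Gamma^\mu_{\nu\kappa}\Gamma^\kappa_{\rho\lambda} - R^\mu{}_{\nu\rho\lambda}\right)W^\nu W^\rho\epsilon^\lambda = 0 \tag{26}$$

with the tangent vector $W^\mu = \frac{\partial x^\mu}{\partial\sigma}$ for spacelike geodesics.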

Consider the collection of geodesics emanating from $P$ and described by $x^\mu(T, n^a; \sigma)$. Let’s compare two
geodesics with the same $n^a$ but emanating from $P$ and $P'$ slightly separated on $\gamma$ and study $\epsilon^\mu = \left(\frac{\partial y^\mu}{\partial T}\right)_{n^a, \sigma} =
(1, \vec 0) = \delta^\mu_T$ (since $y^\mu = (T, \sigma n^a\delta^i_a)$). Plugging this into (26), we see that the first two terms vanish. Also,
$W^\nu = (0, n^a\delta^i_a)$. Evaluating what remains of (26) on $\gamma$, we obtain $(\Gamma^\mu_{iT,j} - R^\mu{}_{ijT})|_\gamma W^iW^j = 0$, and thus
$\Gamma^\mu_{iT,j}|_\gamma = R^\mu{}_{ijT}|_\gamma$, something we already know from (25).

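We can also compare two geodesics both emanating from $P$ but in slightly different directions and study the
separation between them, $\epsilon^\mu_{(a)} = \left(\frac{\partial y^\mu}{\partial n^a}\right)_{T, \sigma} = (0, \sigma\delta^i_a) = \sigma\delta^\mu_a$. Plugging this into (26), we see that this time only
the first term vanishes. Reminding ourselves that $W^\nu = (0, n^b\delta^i_b) = n^b\delta^\nu_b$, we obtain

$$2\Gamma^\mu_{\nu\lambda}n^b\delta^\nu_b\delta^\lambda_a + \left(\Gamma^\mu_{\nu\lambda,\rho} - \Gamma^\mu_{\lambda\kappa}\Gamma^\kappa_{\rho\nu} + \Gamma^\mu_{\nu\kappa}\Gamma^\kappa_{\rho\lambda} - R^\mu{}_{\nu\rho\lambda}\right)n^b\delta^\nu_b n^c\delta^\rho_c\delta^\lambda_a\sigma = 0 \tag{27}$$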


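* We finally succumb to sloppy notation.

Next, note that $\Gamma^\mu_{\nu\lambda} = \Gamma^\mu_{\nu\lambda}|_\gamma + \sigma n^a\delta^\rho_a\Gamma^\mu_{\nu\lambda,\rho}|_\gamma + O(\sigma^2) = \sigma(n^a\delta^\rho_a\Gamma^\mu_{\nu\lambda,\rho}|_\gamma) + O(\sigma^2)$. Setting $\sigma$ to 0 in (27), we get
$0 = 0$, but then collecting the terms linear in $\sigma$, we find $(3\Gamma^\mu_{\nu\lambda,\rho} - R^\mu{}_{\nu\rho\lambda})|_\gamma n^b\delta^\nu_b n^c\delta^\rho_c\delta^\lambda_a = 0$. Thus, we obtain

$$\left(\Gamma^\mu_{ki,j} + \Gamma^\mu_{ji,k}\right)|_\gamma = \frac{1}{3}\left(R^\mu{}_{kji} + R^\mu{}_{jki}\right)|_\gamma \tag{28}$$

To solve this for the Christoffel symbols, generate two other versions of (28) by cycling the indices $(kij) \to
(ijk)$, version A:

$$\left(\Gamma^\mu_{ij,k} + \Gamma^\mu_{kj,i}\right)|_\gamma = \frac{1}{3}\left(R^\mu{}_{ikj} + R^\mu{}_{kij}\right)|_\gamma$$

and $(kij) \to (jki)$, version B:

$$\left(\Gamma^\mu_{jk,i} + \Gamma^\mu_{ik,j}\right)|_\gamma = \frac{1}{3}\left(R^\mu{}_{jik} + R^\mu{}_{ijk}\right)|_\gamma$$

Add version A to and subtract version B from (28) to obtain,⁶ using various symmetry properties of the Christoffel
symbol and the Riemann curvature tensor:

$$\Gamma^\mu_{ij,k}|_\gamma = \frac{1}{3}\left(R^\mu{}_{ikj} + R^\mu{}_{jki}\right)|_\gamma \tag{29}$$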


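Phew! We have finally nailed down, in (25) and (29), all the relevant quantities evaluated on $\gamma$.
Now recall from (1.2.25) that $g_{\mu\nu,\rho} = \Gamma_{\mu\cdot\nu\rho} + \Gamma_{\nu\cdot\mu\rho}$ where $\Gamma_{\mu\cdot\nu\rho} = g_{\mu\kappa}\Gamma^\kappa_{\nu\rho}$. Using (23), $g_{\mu\kappa}|_\gamma = \eta_{\mu\kappa}$ and
$\Gamma^\kappa_{\nu\rho}|_\gamma = 0$, we have $g_{\mu\nu,\rho\sigma}|_\gamma = (\Gamma_{\mu\cdot\nu\rho,\sigma} + \Gamma_{\nu\cdot\mu\rho,\sigma})|_\gamma$.
Using (24), (25), and (29), we finally determine the second derivatives of the metric as follows:

$$g_{TT,ij}|_\gamma = 2\Gamma_{T\cdot Ti,j}|_\gamma = -2R_{TiTj}|_\gamma$$

$$g_{ij,kl}|_\gamma = \left(\Gamma_{i\cdot jk,l} + \Gamma_{j\cdot ik,l}\right)|_\gamma = \frac{1}{3}\left(R_{iklj} + R_{ilkj}\right)|_\gamma$$

$$g_{Ti,jk}|_\gamma = \left(\Gamma_{T\cdot ij,k} + \Gamma_{i\cdot Tj,k}\right)|_\gamma = \left[\frac{1}{3}\left(R_{Tikj} + R_{Tjki}\right) + R_{ijkT}\right]_\gamma = \frac{2}{3}\left(R_{Tjki} + R_{Tkji}\right)|_\gamma \tag{30}$$

In the last step, we used the cyclic identity of the Riemann curvature tensor. Finally, using (25), we have easily
$g_{\mu\nu,T\alpha}|_\gamma = (\Gamma_{\mu\cdot\nu T,\alpha} + \Gamma_{\nu\cdot\mu T,\alpha})|_\gamma = (R_{\mu\nu\alpha T} + R_{\nu\mu\alpha T})|_\gamma = 0$. So, in the second order expansion of $g_{\mu\nu}$, terms
involving $x^0x^0$ and $x^0x^i$ do not appear.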


The metric in Fermi normal coordinates 
By now, we have long forgotten the coordinates that we started out with and traded for Fermi normal coordinates.
So we might as well denote Fermi normal coordinates by $x^\mu$, and also drop the nickname $T$ for the more
respectable 0. The results of this rather long analysis are then summarized by

$$g_{00} = -1 - R_{0i0j}|_\gamma x^ix^j$$

$$g_{ij} = \delta_{ij} - \frac{1}{3}R_{ikjl}|_\gamma x^kx^l$$

$$g_{0i} = \frac{2}{3}R_{0jki}|_\gamma x^jx^k \tag{31}$$

(For example, $g_{0i} = \frac{1}{2}g_{0i,jk}x^jx^k = \frac{1}{3}(R_{Tjki} + R_{Tkji})|_\gamma x^jx^k = \frac{2}{3}R_{0jki}|_\gamma x^jx^k$.) By construction, the Riemann
curvature tensor evaluated on $\gamma$ depends only on $x^0$, not on $x^i$.

That sure was a load of work. So what did we get from all that? To summarize, for a given geodesic $\gamma$, we have
now determined, following the great Fermi, the metric in a small tube around $\gamma$, good to second order.
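
A quick sanity check of the second-order structure, in a Riemannian-signature analog that is not taken from the text: on the unit sphere, take the equator as the geodesic, let $T$ be arc length along it and $y$ the geodesic distance along a meridian. The exact metric is then $ds^2 = \cos^2 y\, dT^2 + dy^2$, while the analog of the first line of (31) would read $g_{TT} \approx 1 - R_{TyTy}\,y^2$ with $R_{TyTy} = 1$ on the equator. The sign convention in which the sphere has positive curvature is an assumption here, and the function names are made up for this sketch.

```python
import math

def g_tt_exact(y):
    """Exact g_TT on the unit sphere in coordinates adapted to the equator."""
    return math.cos(y) ** 2

def g_tt_second_order(y, curvature=1.0):
    """Second-order form 1 - R_{TyTy} y^2, with R_{TyTy} = curvature (assumed sign convention)."""
    return 1.0 - curvature * y * y

def truncation_error(y):
    """Difference between the exact and second-order metrics; expected to scale like y^4."""
    return abs(g_tt_exact(y) - g_tt_second_order(y))

if __name__ == "__main__":
    for y in (0.2, 0.1, 0.05):
        print(f"y = {y:4.2f}   exact = {g_tt_exact(y):.8f}   "
              f"second order = {g_tt_second_order(y):.8f}   error = {truncation_error(y):.2e}")
```

Halving $y$ should cut the error by roughly a factor of 16, which is what "good to second order" means in practice.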

You could now work out the metric $g_{\mu\nu}$ in Fermi normal coordinates to your heart’s content for various freely
falling observers, for example an observer falling radially into a Schwarzschild black hole. Note that $\gamma$ has to be
a geodesic. Along an arbitrary curve, it is not possible to arrange for all the Christoffel symbols to vanish.
Exercises


1 Verify (6) for the sphere. 


2 The price of not heeding Professor Flat: derive the equation of geodesic deviation the hard way, without going 
to locally flat coordinates. 


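3 Derive

$$\frac{d^2\epsilon^\mu}{d\tau^2} + 2\Gamma^\mu_{\nu\lambda}V^\nu\frac{d\epsilon^\lambda}{d\tau} + \left(\Gamma^\mu_{\nu\lambda,\rho} - \Gamma^\mu_{\lambda\kappa}\Gamma^\kappa_{\rho\nu} + \Gamma^\mu_{\nu\kappa}\Gamma^\kappa_{\rho\lambda} - R^\mu{}_{\nu\rho\lambda}\right)V^\nu V^\rho\epsilon^\lambda = 0 \tag{32}$$

with the tangent vector $V^\mu = \frac{dx^\mu}{d\tau}$ as usual.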


4 Verify the implications of the three stated energy conditions, strong, dominant, and weak (see appendix 3), 
for p and P. 


Notes 


1. I now go back to (1) and (2) to point out a slight subtlety. Each of the two geodesics has its own proper
time, given by $d\tau_x = \sqrt{-g_{\mu\nu}(x)dx^\mu dx^\nu}$ and $d\tau_y = \sqrt{-g_{\mu\nu}(y)dy^\mu dy^\nu}$. We could of course trivially set $\tau_x = 0$
and $\tau_y = 0$ at the starting point, and define $\epsilon^\mu(\delta) \equiv y^\mu(\tau_y = \delta) - x^\mu(\tau_x = \delta)$. In other words, we study the
separation between the two observers when each of their watches register the same time. The geodesic
deviation equation describes how $\epsilon^\mu(\delta)$ changes with $\delta$.


2. Recall how the concepts of divergence and curl were explained when you first encountered them. 

3. It is sometimes said that, in Einstein gravity, the common person-in-the-street statement that gravity attracts
requires the strong energy condition. This statement in itself is somewhat misleading: the geodesics here 
represent the movement of point particles in a background spacetime produced by an energy momentum 
tensor satisfying the strong energy condition. 

4. Having been John Wheeler’s student, I am now under the impression, these many years later, that this is 
the kind of phrase Wheeler would have used. 

5. My discussion is based on the work of F. K. Manasse and C. W. Misner, J. Math. Physics (1963), vol. 4, p. 735. 

6. Alternatively, regard $\Gamma^\mu_{ki,j}$ as a 3-by-3 matrix labeled by $j$, $k$, with (28) telling us its symmetric part. Work out
its antisymmetric part. This is essentially equivalent to the procedure given in the text.


Linearized Gravity, Gravitational Waves, and 
the Angular Momentum of Rotating Bodies 


Einstein’s unfinished symphony 


Einstein gave life to spacetime. Previously rigid, spacetime could now curve and move.
Certainly no surprise, then, that Einstein gravity predicts the existence of ripples crisscrossing
the fabric of spacetime, what one writer refers to as Einstein’s unfinished symphony.¹
Massive detectors have been built, with more to come, in an effort to tune in to the “song
of the cosmos.”

Theoretical physicists do not doubt* that gravitational waves exist. Indeed, there is 
already strong indirect evidence for gravitational waves with the discovery of a binary 
pulsar in 1974 by Hulse and Taylor. The change in the orbital period due to emission 
of gravitational waves could be accurately measured, and the data agreed extremely well 
with the prediction of Einstein gravity. The only question is when they will be detected: 
given our detectors, are there sufficiently powerful sources relatively nearby? An exciting 
new era will dawn with gravitational wave astronomy: hopefully, much will be revealed 
about the universe that we cannot see with electromagnetic wave astronomy. 

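Consider a small deviation from the Minkowski metric and write $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$. In
chapter VI.5, we already worked out that to leading order

$$R_{\mu\nu} = -\frac{1}{2}\left(\partial^2h_{\mu\nu} - \partial_\mu\partial_\lambda h^\lambda{}_\nu - \partial_\nu\partial_\lambda h^\lambda{}_\mu + \partial_\mu\partial_\nu h\right) + O(h^2) \tag{1}$$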


All we have to do is to plug this into Einstein’s field equation and watch the ripples. 


* Easy for me to say that now! Einstein famously clashed in 1936 with the editor of Physical Review and an 
anonymous referee over the existence of gravitational waves, even though he himself had introduced that notion 
back in 1916. Having moved to the United States, Einstein submitted a paper, together with his new American 
assistant Nathan Rosen, to Physical Review, claiming that gravitational waves did not exist. The grand old man, 
not used to having his papers rejected, wrote back angrily, vowing never to submit a paper to that journal again. 
The referee was later revealed to be H. P. Robertson.²
Weak field and harmonic gauge


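By making a coordinate transformation $x^\mu \to x'^\mu = x^\mu + \varepsilon^\mu(x)$, we can simplify (1) considerably.
Let $\partial_\mu\varepsilon_\nu$ be small, of the same order as $h_{\mu\nu}$, so that $\frac{\partial x^\mu}{\partial x'^\rho} = \delta^\mu_\rho - \partial_\rho\varepsilon^\mu + O(\varepsilon^2)$.
The coordinate transformation $g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}$ then reduces³ to

$$h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu\varepsilon_\nu - \partial_\nu\varepsilon_\mu \tag{2}$$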


Notice the structural similarity to the electromagnetic gauge transformation $A'_\mu = A_\mu -
\partial_\mu\Lambda$. Very nice! More on this in the next chapter. In electromagnetism, by choosing $\Lambda$,
we can fix a gauge, typically the Lorenz gauge⁴ $\partial_\mu A^\mu = 0$, to simplify Maxwell’s equations.
Conceptually, we have the same situation here.

Indeed, let’s exploit the freedom in (2) to impose the so-called harmonic or de Donder 
gauge condition 


$$\partial_\mu h^\mu{}_\nu = \frac{1}{2}\partial_\nu h \tag{3}$$

where we define the trace $h \equiv \eta^{\mu\nu}h_{\mu\nu} = h^\mu{}_\mu$.

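To see that this is always possible with a judicious choice of $\varepsilon^\mu$, simply use (2) to compute
$(\partial'_\mu h'^\mu{}_\nu - \frac{1}{2}\partial'_\nu h') = (\partial_\mu h^\mu{}_\nu - \frac{1}{2}\partial_\nu h) - \partial^2\varepsilon_\nu$ (where we dropped second order terms, while
noting that $\partial'_\mu = \partial_\mu - (\partial_\mu\varepsilon^\lambda)\partial_\lambda$). Thus, if somebody gives you an $h_{\mu\nu}$ that does not satisfy
(3), you can always choose an $\varepsilon_\nu$, namely a solution of the equation $\partial^2\varepsilon_\nu = (\partial_\mu h^\mu{}_\nu - \frac{1}{2}\partial_\nu h)$,
so that $h'$ satisfies (3) to the order considered. Then drop the prime. Note that to this
order, we raise and lower indices with the Minkowski metric $\eta$.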


Degrees of polarizations in a gravitational wave 


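We started out with a symmetric tensor $h_{\mu\nu}$ with $4\cdot 5/2 = 10$ components. After imposing
the 4 conditions in (3), we are left with $10 - 4 = 6$ components. The key point here is to
realize that, even with $h_{\mu\nu}$ satisfying the harmonic gauge, we can still make a “residual”
transformation (2). Since $\partial_\mu h'^\mu{}_\nu = \partial_\mu h^\mu{}_\nu - \partial^2\varepsilon_\nu - \partial_\nu(\partial\cdot\varepsilon)$ and $\frac{1}{2}\partial_\nu h' = \frac{1}{2}\partial_\nu(h - 2\partial\cdot\varepsilon)$, we
have $(\partial_\mu h'^\mu{}_\nu - \frac{1}{2}\partial_\nu h') = (\partial_\mu h^\mu{}_\nu - \frac{1}{2}\partial_\nu h) - \partial^2\varepsilon_\nu$, and we see that, as long as $\partial^2\varepsilon_\nu = 0$, the
harmonic gauge (3) will continue to be satisfied. This brings the number of physical
degrees of freedom in $h_{\mu\nu}$ down to $10 - 4 - 4 = 2$.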
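
The bookkeeping can be written as a three-line function. The text only does the count in 4 spacetime dimensions; extending it to `dim` dimensions, as well as the function names, are assumptions made for this illustration.

```python
def symmetric_components(dim):
    """Independent components of a symmetric rank-2 tensor in `dim` dimensions."""
    return dim * (dim + 1) // 2

def graviton_polarizations(dim=4):
    """Components minus `dim` gauge conditions minus `dim` residual transformations."""
    return symmetric_components(dim) - dim - dim

def photon_polarizations(dim=4):
    """Components of A_mu minus one gauge condition minus one residual transformation."""
    return dim - 1 - 1

if __name__ == "__main__":
    print("h_mu_nu components in 4D:", symmetric_components(4))   # 4*5/2
    print("graviton polarizations in 4D:", graviton_polarizations(4))
    print("photon polarizations in 4D:", photon_polarizations(4))
```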

It is instructive to compare the familiar story in electromagnetism. By a gauge transformation,
we can impose the Lorenz gauge so that Maxwell’s equations $\partial^\mu F_{\mu\nu} = \partial^\mu(\partial_\mu A_\nu -
\partial_\nu A_\mu) = 0$ simplify to $\partial^2A_\nu = 0$. We can make a “residual” gauge transformation satisfying
$\partial^2\Lambda = 0$ and stay within the Lorenz gauge. Thus, the number of physical degrees of
freedom in the electromagnetic field $A_\mu$ is $4 - 1 - 1 = 2$. You know very well that electromagnetic
waves come in two polarizations.

Inspecting (1), we see that the harmonic gauge is designed to knock off the last 3 terms,
so that $R_{\mu\nu} = -\frac{1}{2}\partial^2h_{\mu\nu}$. In the weak field or linear approximation, Einstein’s field equation
$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = +8\pi GT_{\mu\nu}$ reduces to*

$$\partial^2h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\partial^2h = -16\pi GT_{\mu\nu} \tag{4}$$

or equivalently

$$\partial^2h_{\mu\nu} = -16\pi G\left(T_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}T\right) \tag{5}$$

with $T \equiv \eta^{\mu\nu}T_{\mu\nu}$.
In vacuo, this simplifies further to 


$$\partial^2h_{\mu\nu} = \left(-\frac{\partial^2}{\partial t^2} + \vec\nabla^2\right)h_{\mu\nu} = 0 \tag{6}$$


which we recognize as the standard relativistic wave equation. (The reader with a long
memory may recall that we already encountered this in chapter II.3.) Since the equation
is linear, the general solution can be constructed as a linear superposition of plane waves†
$h_{\mu\nu}(x) = \epsilon_{\mu\nu}(k)\sin(k\cdot x)$, with $k\cdot x \equiv \eta_{\mu\nu}k^\mu x^\nu$ and $k^2 = 0$ as required by (6). The polarization
tensor $\epsilon_{\mu\nu} = \epsilon_{\nu\mu}$ (not to be confused with $\varepsilon_\mu$!) satisfies $k_\mu\epsilon^\mu{}_\nu = \frac{1}{2}k_\nu\epsilon$ (with $\epsilon \equiv \epsilon^\mu{}_\mu$),
the Fourier transform version of (3).
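
A small numerical check that the null condition $k^2 = 0$ is what makes $\sin(k\cdot x)$ solve (6). This is a finite-difference sketch for a single component with $k^\mu = \omega(1, 0, 0, 1)$, so that $k\cdot x = \omega(z - t)$ in signature $(-,+,+,+)$; the function names and step size are illustrative assumptions.

```python
import math

def on_shell_wave(t, z, omega=2.0):
    """sin(k.x) with k^mu = omega (1, 0, 0, 1), so k^2 = 0."""
    return math.sin(omega * (z - t))

def off_shell_wave(t, z, omega=2.0):
    """Same spatial wavelength but half the frequency, so k^2 is not zero."""
    return math.sin(omega * (z - 0.5 * t))

def wave_operator(f, t, z, d=1e-3):
    """Central-difference estimate of (-d^2/dt^2 + d^2/dz^2) f at (t, z)."""
    d2t = (f(t + d, z) - 2.0 * f(t, z) + f(t - d, z)) / d ** 2
    d2z = (f(t, z + d) - 2.0 * f(t, z) + f(t, z - d)) / d ** 2
    return -d2t + d2z

if __name__ == "__main__":
    print("on shell :", wave_operator(on_shell_wave, 0.3, 0.7))
    print("off shell:", wave_operator(off_shell_wave, 0.3, 0.7))
```

The first number should vanish up to rounding noise, while the second should not.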

We now use the 4 degrees of freedom embodied in the residual transformation as
explained above to impose 4 more conditions: $\epsilon_{0i} = 0$, for $i = 1, 2, 3$, and $\epsilon = 0$. (You
should check that this is indeed possible!) With the latter condition, the harmonic condition
collapses to $k^\mu\epsilon_{\mu\nu} = 0$. The gauge defined by these $3 + 1 + 4 = 8$ conditions is known as
the transverse-traceless or TT gauge.

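We can now check that gravitational waves do indeed come in only 2 polarizations,
just like electromagnetic waves. Let the wave propagate along the third axis so that $k^\mu =
\omega(1, 0, 0, 1)$ (recall that $k^2 = 0$). The harmonic condition $k^\mu\epsilon_{\mu\nu} = 0$ implies

$$\epsilon_{0\nu} = -\epsilon_{3\nu} \;\Rightarrow\; \epsilon_{00} = -\epsilon_{30} \quad\text{and}\quad \epsilon_{3i} = -\epsilon_{0i} = 0, \quad\text{for } i = 1, 2, 3 \tag{7}$$

Thus, $\epsilon_{33} = \epsilon_{32} = \epsilon_{31} = 0$. By the symmetry of $\epsilon_{\mu\nu}$, we have $\epsilon_{30} = \epsilon_{03} = 0$, which implies
$\epsilon_{00} = -\epsilon_{30} = 0$. The traceless condition $\epsilon^\mu{}_\mu = 0$ then collapses to $\epsilon_{11} + \epsilon_{22} = 0$.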

You should check that this collection of simple but somewhat confusing statements we
derived merely says that the zeroth and third rows and columns of the symmetric traceless
matrix $\epsilon_{\mu\nu}$ vanish. Thus, the polarization tensor

$$\epsilon_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \epsilon_+ & \epsilon_\times & 0 \\ 0 & \epsilon_\times & -\epsilon_+ & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{8}$$

is characterized by 2 real numbers $\epsilon_+$ and $\epsilon_\times$, describing two independent degrees of
polarizations. When we quantize electromagnetic waves, we obtain photons.‡ Similarly,


* There is a hidden subtlety to this equation that will be made clear in exercise 3. 

† The more sophisticated reader recognizes that more generally, we can also include a $\cos(k\cdot x)$ wave, or even
better, write $h_{\mu\nu}(x) = \epsilon_{\mu\nu}(k)e^{ik\cdot x}$.

‡ As shown in any textbook on quantum field theory.
when we quantize gravitational waves, we obtain gravitons. The 2 polarizations, found
here for a classical gravitational wave, correspond to the 2 helicity states of the graviton in
quantum field theory.*
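
The conditions that define the TT gauge can be checked mechanically on the matrix in (8). The sketch below builds the polarization tensor for a wave along the third axis and tests symmetry, $\epsilon_{0i} = 0$, tracelessness, and $k^\mu\epsilon_{\mu\nu} = 0$; the helper names and the numerical values are made up for the illustration.

```python
ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)  # Minkowski metric, signature (-,+,+,+)

def polarization_tensor(eps_plus, eps_cross):
    """The matrix in (8) for a wave propagating along the third axis."""
    return [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, eps_plus, eps_cross, 0.0],
        [0.0, eps_cross, -eps_plus, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]

def is_transverse_traceless(eps, k, tol=1e-12):
    """Check symmetry, eps_{0i} = 0, zero trace, and k^mu eps_{mu nu} = 0."""
    symmetric = all(abs(eps[m][n] - eps[n][m]) < tol for m in range(4) for n in range(4))
    no_time_space = all(abs(eps[0][i]) < tol for i in range(1, 4))
    traceless = abs(sum(ETA_DIAG[m] * eps[m][m] for m in range(4))) < tol
    transverse = all(abs(sum(k[m] * eps[m][n] for m in range(4))) < tol for n in range(4))
    return symmetric and no_time_space and traceless and transverse

if __name__ == "__main__":
    omega = 1.5
    k_up = [omega, 0.0, 0.0, omega]  # k^mu = omega (1, 0, 0, 1)
    print(is_transverse_traceless(polarization_tensor(0.3, -0.2), k_up))
```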


Detection of gravitational waves 


To understand what the “plus” and “cross” polarizations $\epsilon_+$ and $\epsilon_\times$ mean physically, we
allow a gravitational wave to wash over a cloud of particles.

It is instructive to consider first a single particle initially at rest, that is, with 4-velocity
$V^\mu = (1, 0, 0, 0)$. A gravitational field will cause it to move according to

$$\frac{dV^\rho}{d\tau} + \Gamma^\rho_{\mu\nu}V^\mu V^\nu = 0$$

Thus, at the initial instant, $\frac{dV^\rho}{d\tau} = -\Gamma^\rho_{00}$. Now compute

$$\Gamma^\rho_{00} = \frac{1}{2}\eta^{\rho\sigma}\left(\partial_0h_{0\sigma} + \partial_0h_{0\sigma} - \partial_\sigma h_{00}\right)$$

but this vanishes in the TT gauge in which $\epsilon_{0\mu} = \epsilon_{\mu 0} = 0$ (see (8)). Thus, a moment of proper
time later, $V^\mu$ is still $(1, 0, 0, 0)$. Repeating the argument, we see that a single particle
initially at rest will remain at rest, completely ignoring the passing gravitational wave.

Confusio looks worried. “If a particle does not feel the gravitational wave passing by, is
the wave real?” 

In fact, this apparently counterintuitive conclusion makes physical sense, since a Rie- 
mannian spacetime is by construction locally flat at any given point. 

Our friend the Smart Experimentalist remarks, “Well, you simply need to go beyond 
a single point, to explore its neighborhood, to detect gravitational waves. So throw in a 
bunch of particles!” 

Consider two particles separated by $\zeta^\mu = x_1^\mu - x_2^\mu = (0, \vec\zeta)$. Let’s see if the proper distance
between them $\Delta l = \sqrt{\vec\zeta^2 + h_{ij}\zeta^i\zeta^j} = |\vec\zeta|\left(1 + \frac{1}{2}h_{ij}\zeta^i\zeta^j/|\vec\zeta|^2 + O(h^2)\right)$ changes. Obviously
it does if $h_{\mu\nu}$ varies. For a plane wave propagating along the third axis, rocking and
rolling in the (1-2) plane, we see from (8) that in the TT gauge, we want to set $\vec\zeta$ in the
(1-2) plane. We read off from (8) the fractional change in physical distance between the
two particles (up to a factor of $\frac{1}{2}$ and with the unit 3-vector $\hat\zeta^i \equiv \zeta^i/|\vec\zeta|$):

$$h_{ij}\hat\zeta^i\hat\zeta^j = \left\{\epsilon_+\left[(\hat\zeta^1)^2 - (\hat\zeta^2)^2\right] + 2\epsilon_\times\hat\zeta^1\hat\zeta^2\right\}\sin\omega(z - t) \tag{9}$$

Thus, when a plus polarization wave comes along, a pair of particles separated along the 1-axis
would move 180° out of phase compared to a pair of particles separated along the 2-axis,
but would not respond at all to a cross-polarization wave. In contrast, a cross polarization
wave would excite a pair of particles separated by $\vec\zeta \propto (1, \pm 1, 0)$, something that a plus
polarization wave cannot do.
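
To see these statements in numbers, the sketch below evaluates one half of the right hand side of (9) for a pair of particles whose separation makes an angle with the 1-axis in the (1-2) plane. The function name, the parametrization by angle, and the sample amplitude are assumptions for the illustration; `phase` stands for $\omega(z - t)$.

```python
import math

def fractional_stretch(angle, eps_plus, eps_cross, phase):
    """Fractional change of proper distance, (1/2) h_ij zeta^i zeta^j, for a unit separation
    (cos(angle), sin(angle), 0) and a TT wave travelling along the third axis."""
    z1, z2 = math.cos(angle), math.sin(angle)
    h_contracted = (eps_plus * (z1 * z1 - z2 * z2) + 2.0 * eps_cross * z1 * z2) * math.sin(phase)
    return 0.5 * h_contracted

if __name__ == "__main__":
    amplitude, crest = 1e-3, math.pi / 2
    for label, angle in [("1-axis", 0.0), ("2-axis", math.pi / 2), ("diagonal", math.pi / 4)]:
        plus = fractional_stretch(angle, amplitude, 0.0, crest)
        cross = fractional_stretch(angle, 0.0, amplitude, crest)
        print(f"{label:9s} plus: {plus:+.2e}   cross: {cross:+.2e}")
```

The signs along the 1-axis and 2-axis come out opposite for the plus wave, which is the 180° phase difference mentioned above.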


* The existence of 2 helicity states for massless particles of spin j, be it the photon or the graviton, follows in
general from the Lorentz group and the CPT theorem. See, for example, S. Weinberg, The Quantum Theory of
Fields, or QFT Nut, pp. 186 and 446.
Figure 1 A ring of particles responding to a gravitational wave propagating perpendicular to the paper:
(a) a plus wave; (b) a cross wave.


A few pictures are worth a thousand words, even with some Greek symbols thrown in. 
See figure 1. Panels a and b show how a ring of particles responds to a plus and cross 
polarization wave, respectively, propagating perpendicular to the paper. The pattern is 
characteristic of a tidal force, as was discussed way back in chapter I.4.

Gravitational wave detectors use laser interferometry to measure minute shifts in the 
distance between massive objects. Given how feeble a force gravity is, you can imagine 
the engineering feats involved. Several major projects⁵ are either under way or are being
planned. Hopefully, gravitational waves will be detected soon. 


Emission of gravitational waves 


Thus far, we have studied the propagation of gravitational waves in vacuo. To study their
production, we include a source. We are invited by (3) and (4) to define $\bar h_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}h$
so that

$$\partial^2\bar h_{\mu\nu} = -16\pi GT_{\mu\nu} \tag{10}$$

We see that the harmonic gauge condition $\partial^\mu\bar h_{\mu\nu} = 0$ is consistent with $\partial^\mu T_{\mu\nu} = 0$, as it
had better be.

Comparing this with the equation $\partial^2A_\mu = -J_\mu$ governing the production of electro-
magnetic waves, we see that nothing much conceptually new is involved here, merely an 
extra index going along for the ride. The readers who have studied electromagnetism know 
what to do: define a Green’s function (for those readers who don’t know this, I give a brief 
explanation in appendix 2) by 


$$\partial^2G(x) = \delta^{(4)}(x) \tag{11}$$

which simply says that $G(x)$ is the (scalar) wave due to a unit point source at the origin of
spacetime. The solution to (10) is then given by 


$$\bar h_{\mu\nu}(x) = -16\pi G\int d^4y\, G(x - y)T_{\mu\nu}(y) \tag{12}$$


(plus, trivially, an arbitrary solution of the homogeneous equation, namely (10) with the 
right hand side set to zero). Physically, the Green’s function approach merely says that 
since (10) is linear, we can solve it for a point source, and then, as Christiaan Huygens 
taught us long ago, add up the resulting waves from each point that makes up $T_{\mu\nu}$. That’s
precisely what (12) instructs us to do. 
Note that $G(x)$ is exactly the same Green’s function you would use for electromagnetism:
there are no indices in (11) after all. You may even know it off the top of your head as

$$G(t, \vec x) = -\frac{\theta(t)\delta(t - r)}{4\pi r} \tag{13}$$

with $r = |\vec x|$ and the step function $\theta(t) = 1$ if $t > 0$ and 0 otherwise. This expression makes
perfect sense: the first factor $\theta(t)$ tells us that the wave propagates only into the future
(causality!), the second factor $\delta(t - r)$ says that the wave propagates at the speed of light,
and the third factor $\left(-\frac{1}{4\pi r}\right)$ satisfies* $\vec\nabla^2\left(-\frac{1}{4\pi r}\right) = \delta^{(3)}(\vec x)$.

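Plugging (13) into (12), we obtain

$$\bar h_{\mu\nu}(t, \vec x) = 4G\int d^4y\, \frac{\theta(t - y^0)\delta(t - y^0 - |\vec x - \vec y|)T_{\mu\nu}(y^0, \vec y)}{|\vec x - \vec y|} = 4G\int d^3y\, \frac{T_{\mu\nu}(t - |\vec x - \vec y|, \vec y)}{|\vec x - \vec y|} \tag{14}$$

Note that $T_{\mu\nu}$ is evaluated at the retarded time $t_R(t, \vec x, \vec y) \equiv t - |\vec x - \vec y|$. Just as for electromagnetic
waves, a gravitational wave reaching $\vec x$ at time $t$ had to be emitted at $\vec y$ at time $t_R$.
You plug in whatever $T_{\mu\nu}$ you want into (14) and out pops $\bar h_{\mu\nu}$, as simple as that.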
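
For readers who want the intermediate step between (12), (13), and (14) spelled out, here is a sketch of the $y^0$ integration. It only uses the shifted Green's function $G(x - y)$ and the defining property of the delta function.

```latex
\begin{align*}
\bar h_{\mu\nu}(t, \vec x)
 &= -16\pi G \int d^3y \int dy^0
    \left[-\frac{\theta(t - y^0)\,\delta(t - y^0 - |\vec x - \vec y|)}{4\pi|\vec x - \vec y|}\right]
    T_{\mu\nu}(y^0, \vec y) \\
 &= 4G \int d^3y\, \frac{1}{|\vec x - \vec y|}
    \int dy^0\, \theta(t - y^0)\,\delta\big(y^0 - (t - |\vec x - \vec y|)\big)\,T_{\mu\nu}(y^0, \vec y) \\
 &= 4G \int d^3y\, \frac{T_{\mu\nu}(t - |\vec x - \vec y|, \vec y)}{|\vec x - \vec y|}
\end{align*}
% The delta function sets y^0 = t - |x - y|, which is earlier than t, so the step function equals 1.
```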


Multipole expansion and compact source approximation 


We can repeat many of the things we do in electromagnetism, as I said. For example, we
can expand $\frac{1}{|\vec x - \vec y|}$ as a Taylor series in $y^i$ and obtain the multipole expansion

$$\bar h_{\mu\nu}(t, \vec x) = 4G\left[\frac{1}{r}\int d^3y\, T_{\mu\nu}(t_R, \vec y) + \frac{x^i}{r^3}\int d^3y\, y^iT_{\mu\nu}(t_R, \vec y) + \frac{3x^ix^j - r^2\delta^{ij}}{2r^5}\int d^3y\, y^iy^jT_{\mu\nu}(t_R, \vec y) + \cdots\right] \tag{15}$$

Note that we have not yet expanded the $|\vec x - \vec y| \simeq r\left(1 - \frac{\vec x\cdot\vec y}{r^2} + \cdots\right)$ lurking inside $t_R$. You
see how all that stuff you learned about rotational tensors back in part I could be useful. For
example, the traceless 3-tensor $(3x^ix^j - r^2\delta^{ij})$ pops up. Recall from chapter VI.3 that the
analog of Newton’s theorem, namely the Jebsen-Birkhoff theorem, continues to be valid

* Recall (II.1.12) from way way back.
in Einstein gravity. Around a spherically symmetric static (that is, time independent) mass
distribution, only the first term in (15) survives. 

If $r$ is much larger than the size of the source, as is typically the case, we need to keep only the first term in this expansion, so that $\bar h_{\mu\nu}(t, \vec x) = \frac{4G}{r} \int d^3y\, T_{\mu\nu}(t_R, \vec y)$. For an astrophysical source, $\int d^3y\, T^{00}(t_R, \vec y) = M$ is its energy or mass, and $\int d^3y\, T^{0i}(t_R, \vec y) = P^i$ its momentum. If the source is not moving relative to the observer, so that $P^i = 0$, then we have simply $\bar h_{00} = \frac{4GM}{r}$ and $\bar h^{0i} = 0$, as expected.


Similarly, $\bar h_{ij}$ is given in terms of $\int d^3y\, T_{ij}(t_R, \vec y)$. Since our intuition about $T_{ij}$ is rather weak, as was discussed in chapter III.6, it is preferable to use the conservation law $\partial_\mu T^{\mu\nu} = 0$ to eliminate $T^{ij}$ in favor of $T^{00}$. As an exercise, you can derive

$$\bar h^{ij}(t, \vec x) \simeq \frac{2G}{r} \left. \frac{d^2 Q^{ij}(t)}{dt^2} \right|_{t = t_R} \qquad (16)$$

with $Q^{ij}(t) \equiv \int d^3y\, y^i y^j\, T^{00}(t, \vec y)$ the quadrupole moment of the source.
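
To get a feel for (16), here is a small sketch that evaluates the quadrupole moment of two equal point masses on a circular orbit and differentiates it numerically. Everything specific here is an assumption made for illustration: the masses, orbit radius, angular frequency, the finite-difference step, and the helper names are not from the text, and the nonrelativistic point-mass form $Q^{ij} = \sum m\, y^i y^j$ stands in for the integral over $T^{00}$.

```python
import math

MASS = 1.0          # each of the two point masses (assumed value)
ORBIT_RADIUS = 1.0  # distance of each mass from the center (assumed value)
OMEGA = 2.0         # orbital angular frequency (assumed value)

def quadrupole(t):
    """Q^{ij}(t) = sum over masses of m y^i y^j, for an orbit in the x-y plane."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    positions = [(ORBIT_RADIUS * c, ORBIT_RADIUS * s, 0.0),
                 (-ORBIT_RADIUS * c, -ORBIT_RADIUS * s, 0.0)]
    return [[sum(MASS * p[i] * p[j] for p in positions) for j in range(3)]
            for i in range(3)]

def second_time_derivative(t, dt=1e-4):
    """Central finite difference for d^2 Q^{ij} / dt^2."""
    q_plus, q_mid, q_minus = quadrupole(t + dt), quadrupole(t), quadrupole(t - dt)
    return [[(q_plus[i][j] - 2.0 * q_mid[i][j] + q_minus[i][j]) / dt**2
             for j in range(3)] for i in range(3)]

def hbar_spatial(t, r, G=1.0):
    """Far-field hbar^{ij} from (16): (2G/r) times Q-double-dot at the retarded time t - r."""
    qdd = second_time_derivative(t - r)
    return [[2.0 * G * qdd[i][j] / r for j in range(3)] for i in range(3)]

t_emit = 0.3
qdd = second_time_derivative(t_emit)
# Analytic value for this orbit: Q^{xx} = m a^2 (1 + cos 2 Omega t).
expected_xx = -4.0 * MASS * ORBIT_RADIUS**2 * OMEGA**2 * math.cos(2.0 * OMEGA * t_emit)
print(qdd[0][0], expected_xx)
```

The signal oscillates at twice the orbital frequency, and for this circular orbit the trace of $\ddot Q^{ij}$ vanishes because $Q^{xx} + Q^{yy}$ is constant.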


Weak field around gravitating sources 


We are now able to work out the spacetime around a gravitating source in the weak field 
limit. Consider stationary sources. Recall from chapter VII.5 on rotating black holes that 
a static source is not moving at all, while a stationary source could be moving, but without 
changing in time, for example a mass distribution rotating at constant angular velocity. 
With $T^{\mu\nu}$ in (14) not changing in time, we obtain immediately

$$\bar h_{\mu\nu}(\vec x) = 4G \int d^3y\, \frac{T_{\mu\nu}(\vec y)}{|\vec x - \vec y|} \qquad (17)$$


No need to mess with retardation! The expansion in (15) is then a true multipole expansion. 

Furthermore, if we have a nonrelativistic stationary source so that the typical velocity $v$ of its components is much less than $c$, then recall from chapter III.6 that $T^{0i}$ and $T^{ij}$ are respectively a factor of $v/c$ and $v^2/c^2$ down compared to $T^{00}$. So the leading term is $\bar h_{00}$, followed by $\bar h_{0i}$ and then $\bar h_{ij}$.

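Again, for $r = |\vec x|$ much larger than the size of the source, we obtain from (17) $\bar h_{00} \simeq -4\Phi$, with

$$\Phi = -\frac{G}{r} \int d^3y\, T_{00}(\vec y) = -\frac{GM}{r} \qquad (18)$$

the good old Newtonian potential. You might worry about the factor of 4 in $\bar h_{00} = -4\Phi$, but notice that we still have to convert to $h_{\mu\nu}$. Since $\bar h \equiv \eta^{\mu\nu}\bar h_{\mu\nu} = -\eta^{\mu\nu}h_{\mu\nu} = -h$, we have $h = -\bar h \simeq \bar h_{00} = -4\Phi$. Then $h_{\mu\nu} = \bar h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\bar h$, and so

$$h_{00} = \bar h_{00} + \tfrac{1}{2}\bar h = -4\Phi + 2\Phi = -2\Phi, \qquad h_{11} = \bar h_{11} - \tfrac{1}{2}\bar h = 0 - 2\Phi = -2\Phi$$

and so on.* Thus, we finally obtain, within the approximations made, the perturbed spacetime metric

$$ds^2 \simeq (\eta_{\mu\nu} + h_{\mu\nu})dx^\mu dx^\nu \simeq -(1 + 2\Phi)dt^2 + (1 - 2\Phi)d\vec x^2 + 2h_{0i}\,dt\,dx^i \qquad (19)$$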
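
The bookkeeping that turns $\bar h_{00} = -4\Phi$ into the factors of 2 in (19) can be sketched in a few lines. This is only an illustration under stated assumptions: the metric is taken as $\eta = \mathrm{diag}(-1, 1, 1, 1)$ as in the text, the value of $\Phi$ is an arbitrary small number, and the helper name `trace_reverse` is made up for this example.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, signature (-, +, +, +)

def trace_reverse(hbar):
    """Return h_{mu nu} = hbar_{mu nu} - (1/2) eta_{mu nu} hbar, with hbar the trace."""
    # eta^{mu nu} is diagonal, so the trace only involves diagonal entries.
    trace = sum(ETA[m] * hbar[m][m] for m in range(4))
    return [[hbar[m][n] - 0.5 * (ETA[m] if m == n else 0.0) * trace
             for n in range(4)] for m in range(4)]

phi = -1.0e-6  # an assumed small Newtonian potential, -GM/r

# Leading weak-field solution for a static nonrelativistic source: only hbar_00 is nonzero.
hbar = [[0.0] * 4 for _ in range(4)]
hbar[0][0] = -4.0 * phi

h = trace_reverse(hbar)
print("h_00 =", h[0][0], " h_11 =", h[1][1])
```

Both $h_{00}$ and $h_{11}$ come out equal to $-2\Phi$, even though $\bar h_{11}$ vanishes at this order, which is the point of the footnote attached to "and so on."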


* Note that a common error is to suppose that $\bar h_{00} \gg \bar h_{0i} \gg \bar h_{ij}$ implies that the same holds for $h_{\mu\nu}$.

For a static source, $T_{0i}$ and hence $h_{0i}$ vanish, and we obtain a form which the Schwarzschild metric must, and does, satisfy to this order. See also exercise 3.

It is worthwhile clarifying a point some texts appear to be confused about. When we look at the asymptotic behavior of $g_{00} \simeq -\left(1 - \frac{2GM}{r}\right)$ to determine the mass of the Schwarzschild and of the Kerr black hole, we are not using the weak field approximation of this chapter. Far away from the black hole, where the observer sensing the asymptotic behavior is located, the gravitational field is indeed weak. But the field is definitely not weak near or inside the black hole, where $h_{\mu\nu}$ is in fact of $O(1)$. Schematically, we can write the full Einstein field equation as $\partial^2 h \sim GT + (h\partial^2 h + h\partial h\partial h + \cdots)$ with the expression in the parentheses, together with the $\partial^2 h$ on the left hand side, equal to the infinite series expansion of $R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R$. In other words, the mass $M$ in $g_{00} \simeq -\left(1 - \frac{2GM}{r}\right)$, as deduced by the hard-working astronomer measuring orbits (as explained by the Smart Experimentalist in chapter VII.5), is given by something like

$$\int d^3x \left\{T + (h\partial^2 h + h\partial h\partial h + \cdots)/G\right\}$$


according to the theory. The gravitational self-interaction is automatically included. (See 
also the discussion in chapter VII.4.) We will explore this important conceptual point 
further in appendix 3. 


Slowly rotating bodies 


What about $h_{0i}$? As discussed in chapter VII.5 on the Kerr black hole, a term $g_{t\varphi}$ in the metric indicates that the source is rotating.

Let us consider a slowly rotating body whose energy momentum tensor is dominated by $T^{\mu\nu} \simeq \rho U^\mu U^\nu$, with $\rho$ the mass density and with the pressure term negligible. Let the rotation be about the z-axis, and let the angular velocity be $\omega$, so that $U^0 \simeq 1 \gg U^i$, $U^x = -\omega y$, $U^y = +\omega x$, and $U^z = 0$. (Confusing notation alert: we mix up two different notations $\vec x = (x^1, x^2, x^3) = (x, y, z)$ here for the sake of clarity!) The total angular momentum is then $J = \int d^3x\, \rho(\vec x)(xU^y - yU^x) = \omega \int d^3x\, \rho(\vec x)(x^2 + y^2)$.

Next, evaluate $\bar h_{01}(\vec x) = -4G \int d^3x'\, \frac{T^{01}(\vec x')}{|\vec x - \vec x'|}$ by plugging in $T^{01}(\vec x') \simeq -\rho\omega y'$ (again, notation alert: $\vec x' = (x'^1, x'^2, x'^3) = (x', y', z')$) and expanding

$$\frac{1}{|\vec x - \vec x'|} = \frac{1}{r}\left(1 + \frac{\vec x \cdot \vec x'}{r^2} + \cdots\right)$$

We obtain

$$h_{01} = \bar h_{01} \simeq \frac{4G}{r} \int d^3x'\, \rho\omega y' \left(\frac{\vec x \cdot \vec x'}{r^2}\right) = \frac{4G}{r^3}\, y \int d^3x'\, \rho\omega\, \frac{x'^2 + y'^2}{2} = \frac{2GJ}{r^3}\, y \qquad (20)$$

where we have invoked rotational invariance around the third axis (you’ve probably done similar calculations in electromagnetism). In the same way, $h_{02} \simeq -(2GJ/r^3)\,x$. We obtain $h_{0i}dx^i = -(2GJ/r^3)(x^1 dx^2 - x^2 dx^1) = -(2GJ/r)\sin^2\theta\, d\varphi$, in agreement with what we had in chapter VII.5.

Indeed, we are now able to evaluate the angular momentum of the Kerr black hole in the limit of slow rotation. Expand the Kerr metric (VII.5.15) for $r \gg r_S, a$:

$$ds^2 \simeq -\left(1 - \frac{r_S}{r}\right)dt^2 - \frac{2r_S a}{r}\sin^2\theta\, dt\, d\varphi + \left(1 + \frac{r_S}{r}\right)dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\varphi^2\right) \qquad (21)$$

Comparing (20) and (21), we deduce that $r_S a = 2GJ$. Thus, we have verified, as promised, that the definition $J = Ma$ given in chapter VII.5 reduces in the small $a$ limit to our usual understanding of angular momentum.
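
The far-field estimate (20) can be tested against a direct sum. The sketch below is an illustration only and rests on assumptions not in the text: the rotating body is modeled as a thin ring of point masses in the x-y plane, and the ring radius, mass, angular velocity, field point, and function names are all made up for the example. It uses the text's starting formula $\bar h_{01} = -4G \int d^3x'\, T^{01}/|\vec x - \vec x'|$ with $T^{01} \simeq -\rho\omega y'$.

```python
import math

def h01_direct(field_point, G=1.0, total_mass=1.0, radius=1.0, omega=0.01, n_points=2000):
    """Sum -4G T^{01} / |x - x'| over a slowly rotating ring of point masses."""
    X, Y, Z = field_point
    dm = total_mass / n_points
    total = 0.0
    for k in range(n_points):
        angle = 2.0 * math.pi * k / n_points
        xs, ys = radius * math.cos(angle), radius * math.sin(angle)
        t01 = -dm * omega * ys  # T^{01} = rho U^0 U^x with U^x = -omega y'
        distance = math.sqrt((X - xs) ** 2 + (Y - ys) ** 2 + Z ** 2)
        total += t01 / distance
    return -4.0 * G * total

def h01_far_field(field_point, angular_momentum, G=1.0):
    """Leading far-field form from (20): 2 G J y / r^3."""
    X, Y, Z = field_point
    r = math.sqrt(X * X + Y * Y + Z * Z)
    return 2.0 * G * angular_momentum * Y / r**3

point = (36.0, 48.0, 80.0)  # r = 100, far from a ring of radius 1
J_ring = 0.01 * 1.0 * 1.0**2  # J = omega * M * a^2 for a thin ring

numeric = h01_direct(point)
estimate = h01_far_field(point, J_ring)
relative_difference = abs(numeric - estimate) / abs(estimate)
print(numeric, estimate, relative_difference)
```

The two numbers should agree up to corrections of relative order (ring radius / r) squared, coming from the higher multipoles dropped in (20).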

People often define $h_{0i} = -4A^i$, with a notation intentionally suggestive of electromagnetism: if we think of $GT^{00}$ and $GT^{0i}$ as the charge and current density, respectively, then $\Phi$ and $A^i$ correspond to the scalar and vector potentials of electromagnetism, respectively. Some physicists rightfully call the field $\vec\nabla \times \vec A$ the gravitomagnetic field, produced by a mass current, just like the magnetic field produced by a charge current.


Quadrupole radiation 


Back in chapter VI.3, I mentioned the Jebsen-Birkhoff theorem and its analogy to Newton’s 
two superb theorems. You might have been slightly puzzled. Outside a pulsating spher- 
ically symmetric mass distribution, the spacetime remains stubbornly Schwarzschild, 
heedless of the pulsation. In Newtonian gravity, the absence of gravitational waves means 
that the pulsation does not result in any radiation, but in Einstein gravity, it would 
seem at first sight that the gravitational wave can communicate the pulsation to the 
spacetime outside. But now we understand why it can’t. We learned from (16) that, far 
away from the source, the gravitational wave is generated by the quadrupole moment 
$Q^{ij}(t) = \int d^3y\, y^i y^j\, T^{00}(t, \vec y)$, and a spherically symmetric mass distribution simply doesn’t 
have one. The classical statement that only quadrupole and higher moments can generate 
gravitational waves corresponds to the quantum statement that the graviton carries spin 2. 
You are probably familiar with the analogous statements in electromagnetism. A pulsating 
spherically symmetric charge distribution also cannot radiate. The classical statement that 
only dipole and higher moments can generate electromagnetic waves corresponds to the 
quantum statement that the photon carries spin 1. 


Gravity is nonlinear 


I have pointed out that much of what we learned about electromagnetic waves can be 
taken over for gravitational waves, but we must keep in mind that gravity is fundamentally 
different from electromagnetism. While Maxwell theory is linear, Einstein gravity is highly 
nonlinear, as we discussed after (19) and will emphasize in the next chapter. Within the 
linear approximation used in this chapter, things are simple since we are in a Minkowskian 
background. But beyond this approximation, the background will feel the energy and 
momentum of the gravitational wave and will deviate from Minkowskian. We will have to 
take into account the curvature of the background, and then the calculation of gravitational wave propagation becomes significantly more involved. A detailed treatment that does full justice to the subject is beyond the scope of this introductory text. For more, see review articles and more advanced texts.


Appendix 1: Determining the weak field action without Riemann 


Time for another extragalactic fable. The smart young physicist that is you has been thinking about gravity along the same line as that Einstein in a civilization far far away. You understand that gravity is mediated by a tensor field $h_{\mu\nu}$ describing small deviations of the metric $g_{\mu\nu}$ from the Minkowski metric $\eta_{\mu\nu}$. You also realize the importance of coordinate transformation for a theory of gravity, and you get as far as understanding that the action must not change when $h_{\mu\nu}$ changes by $\delta h_{\mu\nu} = \partial_\mu\varepsilon_\nu + \partial_\nu\varepsilon_\mu$.

Unfortunately for you, your civilization has not yet produced a Riemann, and you have no idea how to construct an action out of $g_{\mu\nu}$.

What to do? You can still construct an action in the weak field regime. First, list all possible terms quadratic in $h$ and quadratic in $\partial$. Lorentz invariance tells us that there are 4 possible terms:

$$S = \int d^4x \left(a\,\partial_\lambda h^{\mu\nu}\partial^\lambda h_{\mu\nu} + b\,\partial_\lambda h^\mu_{\ \mu}\,\partial^\lambda h^\nu_{\ \nu} + c\,\partial_\lambda h^{\lambda\nu}\partial^\mu h_{\mu\nu} + d\,\partial^\nu h^\lambda_{\ \lambda}\,\partial^\mu h_{\mu\nu}\right) \qquad (22)$$

with 4 unknown constants $a$, $b$, $c$, and $d$. Note that we are talking about an action in Minkowski spacetime here. (To see that these are the only terms, first write down terms with the indices on the two $\partial$ matching, then the terms with the index on a $\partial$ matching an index on an $h$, and so on.)

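Now impose your invariance requirement. Vary $S$ with $\delta h_{\mu\nu} = -(\partial_\mu\varepsilon_\nu + \partial_\nu\varepsilon_\mu)$, integrating by parts freely. For example, $\delta(\partial_\lambda h^{\mu\nu}\partial^\lambda h_{\mu\nu}) = -2(\partial_\lambda(2\partial^\mu\varepsilon^\nu))(\partial^\lambda h_{\mu\nu})$ “=” $-4\varepsilon^\nu\partial^2\partial^\mu h_{\mu\nu}$, where the two integrations by parts leave the overall sign unchanged. Since there are 3 objects linear in $h$, linear in $\varepsilon$, and cubic in $\partial$ (namely $\varepsilon^\nu\partial^2\partial_\nu h$ and $\varepsilon^\nu\partial_\nu\partial^\lambda\partial^\mu h_{\lambda\mu}$ in addition to the one already displayed), the condition $\delta S = 0$ gives 3 equations, just enough to fix the action up to an overall constant, corresponding to Newton’s constant. You work out, by high school algebra, that the combination

$$I = \tfrac{1}{2}\partial_\lambda h^{\mu\nu}\partial^\lambda h_{\mu\nu} - \tfrac{1}{2}\partial_\lambda h^\mu_{\ \mu}\,\partial^\lambda h^\nu_{\ \nu} - \partial_\lambda h^{\lambda\nu}\partial^\mu h_{\mu\nu} + \partial^\nu h^\lambda_{\ \lambda}\,\partial^\mu h_{\mu\nu} \qquad (23)$$

is invariant.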
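
The "high school algebra" can be delegated to a few lines of code. The sketch below is an illustration under stated assumptions, not the author's method: each derivative $\partial_\mu$ is replaced by a fixed vector $k_\mu$ (as one would do in momentum space), so the four terms of (22) become a symmetric bilinear form in $h$, and invariance means that this form vanishes when one argument is a gauge variation $k_\mu\varepsilon_\nu + k_\nu\varepsilon_\mu$. The sample values of $k$, $\varepsilon$, and $h$, and all function names, are hypothetical.

```python
ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature (-, +, +, +)
N = range(4)

def dot(u, v):
    """u^mu v_mu for lower-index vectors u, v."""
    return sum(ETA[m] * u[m] * v[m] for m in N)

def trace(h):
    """h^mu_mu."""
    return sum(ETA[m] * h[m][m] for m in N)

def contract_k(k, h):
    """k^mu h_{mu nu}, returned as a lower-index vector."""
    return [sum(ETA[m] * k[m] * h[m][n] for m in N) for n in N]

def inner(h1, h2):
    """h1^{mu nu} h2_{mu nu}."""
    return sum(ETA[m] * ETA[n] * h1[m][n] * h2[m][n] for m in N for n in N)

def bilinear(h1, h2, k, coeffs):
    """Symmetric bilinear form of the four terms in (22), with each derivative replaced by k."""
    a, b, c, d = coeffs
    k2 = dot(k, k)
    kh1, kh2 = contract_k(k, h1), contract_k(k, h2)
    return (a * k2 * inner(h1, h2)
            + b * k2 * trace(h1) * trace(h2)
            + c * dot(kh1, kh2)
            + 0.5 * d * (trace(h1) * dot(k, kh2) + trace(h2) * dot(k, kh1)))

def gauge_variation(k, eps):
    """k_mu eps_nu + k_nu eps_mu (the overall sign does not affect the test)."""
    return [[k[m] * eps[n] + k[n] * eps[m] for n in N] for m in N]

# Assumed sample data: any generic choice should do.
k = (1.0, 0.3, -0.2, 0.5)
eps = (0.4, -1.0, 0.7, 0.2)
h = [[1.0 / (1 + m + n) for n in N] for m in N]  # an arbitrary symmetric tensor

delta_h = gauge_variation(k, eps)
invariant_variation = bilinear(h, delta_h, k, (0.5, -0.5, -1.0, 1.0))  # coefficients of (23)
broken_variation = bilinear(h, delta_h, k, (0.5, -0.5, -1.0, 0.5))     # last coefficient detuned
print(invariant_variation, broken_variation)
```

With the coefficients of (23) the variation vanishes to rounding error, while detuning any single coefficient leaves a nonzero remainder, in line with the claim that invariance fixes the action up to an overall constant.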
You triumphantly publish a new theory of gravity* with the action

$$S_{\text{your name here}} = \int d^4x \left(-\frac{1}{32\pi G}\, I + \frac{1}{2}h_{\mu\nu}T^{\mu\nu}\right)$$

with the coefficient of $I$ fixed by comparing with Newtonian gravity. Indeed, the same story could have been told in our civilization. Without a little help from his friends, Einstein might have never heard of Riemannian geometry.

In other words, even if we had never heard of the Einstein-Hilbert action, we can still determine the action for gravity in the weak field limit by requiring the action to be invariant under the transformation (2), hardly surprising, since coordinate invariance determines the Einstein-Hilbert action uniquely.⁹ Still, it is nice to construct gravity from scratch.

Note that invariance of $S_{\text{your name here}}$ under $\delta h_{\mu\nu} = \partial_\mu\varepsilon_\nu + \partial_\nu\varepsilon_\mu$ also tells us that the tensor $T^{\mu\nu}$ that $h_{\mu\nu}$ couples to must satisfy $\partial_\mu T^{\mu\nu} = 0$.

We have already noted that the electromagnetic gauge transformation $A'_\mu = A_\mu - \partial_\mu\Lambda$ is structurally similar. Using this observation in chapter IV.2, we immediately constructed the invariant field strength $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, which we squared to obtain the action. But suppose we didn’t know about $F_{\mu\nu}$. Following the same procedure here, we list the two possible terms quadratic in $A$ and quadratic in $\partial$ allowed by Lorentz and form the combination $L = a\,\partial^\mu A^\nu\partial_\mu A_\nu + b\,\partial^\mu A^\nu\partial_\nu A_\mu$. The requirement that this is invariant under gauge transformation fixes $b = -a$, so that $L = a\,\partial^\mu A^\nu(\partial_\mu A_\nu - \partial_\nu A_\mu) = a\,\partial^\mu A^\nu F_{\mu\nu} = \frac{1}{2}a\,F^{\mu\nu}F_{\mu\nu}$. Out pop $F_{\mu\nu}$ and the Maxwell action, not just in some weak field approximation, but in their full splendor.


* This discussion also connects with that in the next chapter.

Even though the following is slightly out of the scope of this text, I cannot resist remarking on how the harmonic gauge condition $\partial^\mu h_{\mu\nu} - \frac{1}{2}\partial_\nu h^\lambda_{\ \lambda} = 0$ is imposed in quantum field theory. We add the square of the left hand side $(\partial^\mu h_{\mu\nu} - \frac{1}{2}\partial_\nu h^\lambda_{\ \lambda})^2$ to $I$ and observe, in clever satisfaction, that this knocks off the last two terms in $I$, so that the weak field action effectively becomes

$$S_{\text{weak field}} = \int d^4x\, \frac{1}{2}\left[-\frac{1}{32\pi G}\left(\partial_\lambda h^{\mu\nu}\partial^\lambda h_{\mu\nu} - \frac{1}{2}\partial_\lambda h\,\partial^\lambda h\right) + h_{\mu\nu}T^{\mu\nu}\right] \qquad (24)$$

with $h \equiv h^\lambda_{\ \lambda}$. This crucial step allows us to construct the Feynman propagator for the graviton.¹⁰ If you knew just a tiny bit of field theory, you could now derive Einstein’s famous calculation of the deflection of light directly¹¹ from $S_{\text{weak field}}$ without having to say a word about Riemann or curvature.


Appendix 2: Green’s function 


As promised, I briefly explain Green’s function for those readers who have never heard of it. Others can skip this 
appendix. 
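First, the poor man’s approach. Recall from chapter II.1 that the solution of

$$\nabla^2 G(\vec x) = \delta^{(3)}(\vec x) \qquad (25)$$

is $G(\vec x) = -\frac{1}{4\pi r}$. We want to solve

$$\partial^2 G(x) = \delta^{(4)}(x) \qquad (26)$$

Away from the origin, the equation becomes, in spherical coordinates, $-\frac{\partial^2 G}{\partial t^2} + \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial G}{\partial r}\right) = 0$. Writing $G = g(t, r)/r$, we obtain $\left(-\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial r^2}\right)g = 0$, which, as we recall from chapter II.3, has the solution $g(t, r) = f_{\text{out}}(t - r) + f_{\text{in}}(t + r)$, with $f_{\text{out}}$ and $f_{\text{in}}$ two arbitrary functions corresponding to outgoing and incoming spherical waves. Physically, we keep only the outgoing piece, so that $G = f(t - r)/r$, with some unknown function $f$.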
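
The claim that $f(t - r)/r$ solves the source-free equation for any profile $f$ can be spot-checked with finite differences. The sketch below is illustrative only: the Gaussian profile, the sample point, the step size, and the function names are assumptions made for this example, and the operator is the radial form $-\partial_t^2 + \frac{1}{r^2}\partial_r(r^2\partial_r)$ quoted above.

```python
import math

def profile(u):
    """An arbitrary smooth outgoing profile f(u); a Gaussian is assumed here."""
    return math.exp(-u * u)

def G(t, r):
    """Outgoing spherical wave f(t - r) / r."""
    return profile(t - r) / r

def wave_operator_on_G(t, r, step=1e-3):
    """Finite-difference estimate of (-d^2/dt^2 + radial Laplacian) G at r > 0."""
    d2_dt2 = (G(t + step, r) - 2.0 * G(t, r) + G(t - step, r)) / step**2
    # Conservative discretization of (1/r^2) d/dr (r^2 dG/dr).
    flux_out = (r + step / 2.0) ** 2 * (G(t, r + step) - G(t, r)) / step
    flux_in = (r - step / 2.0) ** 2 * (G(t, r) - G(t, r - step)) / step
    radial_laplacian = (flux_out - flux_in) / (step * r**2)
    return -d2_dt2 + radial_laplacian, d2_dt2

residual, time_part = wave_operator_on_G(t=1.3, r=0.9)
print("residual:", residual, " size of the time-derivative term alone:", time_part)
```

The two pieces are individually of order one but cancel to within the discretization error, as expected away from the origin.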

But so far, we have only solved (26) without the delta function source, you protest—of course it’s easy! 
The solution $G = f(t - r)/r$ is only valid for $r > 0$. Now the poor man makes a clever observation. Evaluate $\partial^2 G = \partial^2(f(t - r)/r)$ and take the limit $r \to 0$. In this limit, a spatial derivative hitting $1/r$ makes the result more singular but a time derivative hitting $f(t - r)$ does not. Hence, we can drop the time derivatives in $\partial^2$ and write, as $r \to 0$,

$$\partial^2 G(x) = \left(-\frac{\partial^2}{\partial t^2} + \nabla^2\right)G(x) \to f(t)\,\nabla^2\left(\frac{1}{r}\right) \qquad (27)$$

If we choose $f(t) = -\delta(t)/(4\pi)$ and use the known solution of (25), then this becomes $\delta(t)\delta^{(3)}(\vec x)$ as desired. It follows that $G = -\delta(t - r)/(4\pi r)$. We can multiply this expression by $\theta(t)$ for free, since it vanishes for $t < 0$ anyway, as $r > 0$ by definition. This gives the result cited in the text.

The rich man recognizes that (25) and (26) are members of a large class of linear equations easily solved by Fourier analysis. He or she also knows the Fourier representation of the (1-dimensional) delta function $\delta(x) = \int \frac{dk}{2\pi}e^{ikx}$. Thus, (25) and (26) are immediately solved by $G = -\int \frac{d^3k}{(2\pi)^3}\frac{e^{i\vec k\cdot\vec x}}{k^2}$ and $G = -\int \frac{d^4k}{(2\pi)^4}\frac{e^{ikx}}{k^2}$, respectively, as you can see by formally plugging the appropriate expression into (25) and (26). But there is a huge difference between these two integrals, hidden by the highly compact notation. The expression $k^2$ in the denominator is evaluated with the Euclidean metric for (25) and thus is equal to $\delta_{ij}k^ik^j = \vec k^2$, but it is evaluated with the Minkowski metric for (26) and thus is equal to $\eta_{\mu\nu}k^\mu k^\nu = -(k^0)^2 + \vec k^2$. In the first case, the integral over $d^3k$ can be done without too much difficulty and reproduces the result $G(\vec x) = -\frac{1}{4\pi r}$. In 
the second case, in integrating over $k^0$, we have to specify what to do with the poles at $k^0 = \pm|\vec k|$. For a detailed explanation, see any book on mathematical physics or field theory.¹² It turns out that the correct procedure is to interpret $k^2$ as $-(k^0)^2 + \vec k^2 - i\varepsilon\,\mathrm{sign}(k^0)$, with the sign function defined as usual and with $\varepsilon$ a positive infinitesimal to be set to 0 after the integral has been done. This procedure reproduces the result $G = -\theta(t)\delta(t - r)/(4\pi r)$.

My advice for those not that familiar with the preceding is to simply go with the poor man’s approach.¹³


Appendix 3: The gravitational field far from a possibly strong 
stationary source 


This appendix is rather involved and can be skipped upon a first reading of this book.

As I emphasized in the text, the discussion in this chapter assumes that the gravitational field is weak everywhere. Here we want to study the gravitational field far from a strong source, say, a black hole. Near the black hole, the gravitational field is anything but weak and $h_{\mu\nu}$ is of order unity, so it would seem that linearized gravity would not have much to say. But the point is that, for an observer located sufficiently far away from the black hole, the local gravitational field she measures is weak enough for linearized gravity to hold. Far away, we can still exploit the field equation $\partial^2\bar h_{\mu\nu} = 0$ and the harmonic gauge condition $\partial^\mu\bar h_{\mu\nu} = 0$. We assume in addition that the source is stationary, so that the field equation reduces to

$$\nabla^2\bar h_{\mu\nu} = 0 \qquad (28)$$


We now show that general considerations, based on rotational invariance, time reversal, parity, and so on, severely restrict the possible form of $\bar h_{\mu\nu}$. The goal is now to determine $\bar h_{00}$, $\bar h_{0i}$, $\bar h_{ij}$ using general principles. Given that the metric does not depend on time, our manipulations will be strictly limited to everyday 3-dimensional quantities (as you will see). Hence we can suspend temporarily, for ease of writing, the distinction between upper and lower indices. Or, if you prefer, we raise and lower spatial indices with the Kronecker delta, without any funny signs. You will see what I mean.

A preliminary remark. Starting with the solution $1/r$ of Laplace’s equation, namely $\nabla^2(1/r) = 0$ for $r \neq 0$, we can generate more solutions simply by differentiating, since

$$\partial_i(1/r) = -\frac{x^i}{r^3}, \qquad \partial_i\partial_j(1/r) = \frac{3x^ix^j - \delta^{ij}r^2}{r^5},$$

and so on all solve Laplace’s equation.

The subsequent discussion will be in two parts. The second part is more general than the first part. 

First, we assume that our source defines an angular momentum vector $\vec J$. We now go way back to the discussion of rotational invariance in chapter I.3 to construct $\bar h_{00}$, $\bar h_{0i}$, and $\bar h_{ij}$ out of what we have available, namely the vectors $\vec x$ and $\vec J$.

A clarifying word or two about time reversal and parity. Under time reversal, $t \to -t$, $\vec x \to \vec x$, $\vec J \to -\vec J$. To leave $ds^2$ invariant, we must have $h_{00} \to h_{00}$, $h_{0i} \to -h_{0i}$, and $h_{ij} \to h_{ij}$. That’s obvious enough. Parity is normally defined as the operation $t \to t$, $\vec x \to -\vec x$, but since $\vec x \to -\vec x$ is equivalent to a reflection in a mirror placed in the (y-z) plane, $(x, y, z) \to (-x, y, z)$, followed by a rotation, we can equivalently think of mirror reflection. Think of an object rotating around the z-axis, and you can see that mirror reflection takes $\vec J \to -\vec J$.

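So, write down the most general form consistent with these symmetries and impose the gauge condition. The result is

$$\bar h_{00} = \frac{A}{r} + O\left(\frac{1}{r^3}\right), \qquad \bar h_{0i} = \frac{2\varepsilon^{ijk}x^jJ^k}{r^3} + O\left(\frac{1}{r^3}\right), \qquad \bar h_{ij} = O\left(\frac{1}{r^3}\right) \qquad (29)$$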


To see how this goes, take $\bar h_{0i}$, for instance. Let $\vec J$ point in the z-direction. Time reversal flips $h_{0i}$ and hence $\bar h_{0i}$. Thus, $\bar h_{0i}$ must be odd in $\vec J$. Similarly, for reflection as defined above to leave $ds^2 = (\cdots + 2(h_{01}dt\,dx + h_{02}dt\,dy + h_{03}dt\,dz) + \cdots)$ invariant, we must flip the sign of $h_{01}$ and hence $\bar h_{01}$, but leave $h_{02}$ and $h_{03}$ alone. This mandates the appearance of the 3-dimensional antisymmetric symbol and fixes the form of $\bar h_{0i}$ up to an overall constant.

It is worth emphasizing that assuming rotational symmetry is not the same as assuming spherical symmetry, 
which is broken explicitly by the presence of the vector J. 

Second, we will be more general and not assume that the source provides a vector $\vec J$ around which we have cylindrical symmetry. The most general form that solves Laplace’s equation we can write down is then

$$\bar h_{00} = \frac{A}{r} + \frac{A^ix^i}{r^3} + O\left(\frac{1}{r^3}\right)$$

$$\bar h_{0i} = \frac{B^i}{r} + \frac{B^{ij}x^j}{r^3} + O\left(\frac{1}{r^3}\right)$$

$$\bar h_{ij} = \frac{C^{ij}}{r} + \frac{C^{ijk}x^k}{r^3} + O\left(\frac{1}{r^3}\right) \qquad (30)$$

(the one assumption, I remind you, is that $\partial_0 g_{\mu\nu} = 0$). The various unknown rotational tensors characteristic of the source have obvious symmetry properties, for example $C^{ijk} = C^{jik}$.
Now that we have taken care of Einstein’s field equation, we impose the harmonic gauge condition $\partial^\mu\bar h_{\mu\nu} = 0$.

First, for $\nu = 0$, the condition $0 = \partial^\mu\bar h_{\mu 0} = \partial_i\bar h_{i0} = -\frac{B^ix^i}{r^3} + \frac{B^{ij}(\delta^{ij}r^2 - 3x^ix^j)}{r^5} + O\left(\frac{1}{r^4}\right)$ implies that $B^ix^i = 0$ and $B^{ij}(\delta^{ij}r^2 - 3x^ix^j) = 0$. The first condition gives immediately $B^i = 0$. To solve the second condition, we triumphantly use what we learned in chapter I.3 about rotational tensors. Decompose $B^{ij}$ into a traceless symmetric tensor, a trace, and an antisymmetric tensor. The stated condition knocks out the traceless symmetric tensor, so that $B^{ij} = B\delta^{ij} + 2\varepsilon^{ijk}J^k$ with some unknown scalar $B$ and vector $\vec J$ (which will turn out to be the angular momentum vector; let’s not be coy about it).
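
Here is a quick numerical illustration of why only the trace and the antisymmetric part of $B^{ij}$ survive. It is a sketch under assumed sample values: the scalar, the vector, the field point, and the helper names are invented for the example. It evaluates the contraction $B^{ij}(\delta^{ij}r^2 - 3x^ix^j)$ for an allowed tensor and for a traceless symmetric one.

```python
def levi_civita(i, j, k):
    """Antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def gauge_contraction(B, x):
    """Evaluate B^{ij} (delta^{ij} r^2 - 3 x^i x^j)."""
    r2 = sum(c * c for c in x)
    return sum(B[i][j] * ((r2 if i == j else 0.0) - 3.0 * x[i] * x[j])
               for i in range(3) for j in range(3))

scalar_B = 1.7                 # assumed value of the trace part
vector_J = (0.2, -0.5, 1.1)    # assumed vector in the antisymmetric part
x_point = (0.3, -1.2, 2.0)     # an arbitrary field point

# Allowed form: B delta^{ij} + 2 epsilon^{ijk} J^k.
B_allowed = [[scalar_B * (1.0 if i == j else 0.0)
              + 2.0 * sum(levi_civita(i, j, k) * vector_J[k] for k in range(3))
              for j in range(3)] for i in range(3)]

# A traceless symmetric tensor, which the gauge condition should exclude.
B_traceless_symmetric = [[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0],
                         [0.0, 0.0, -2.0]]

contraction_allowed = gauge_contraction(B_allowed, x_point)
contraction_forbidden = gauge_contraction(B_traceless_symmetric, x_point)
print(contraction_allowed, contraction_forbidden)
```

The first contraction vanishes for every field point, while the second does not, so a traceless symmetric piece would violate the gauge condition.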

Onward soldiers! Setting $\nu = i$ in the gauge condition, we find that $0 = \partial^\mu\bar h_{\mu i} = \partial_j\bar h_{ji}$ gives $C^{ij} = 0$ and $C^{ijk}(\delta^{jk}r^2 - 3x^jx^k) = 0$. Now, in this round, solving the second equation really challenges us to show off our mastery of chapter I.3. We have a 3-indexed tensor $C^{ijk}$ symmetric in the first two indices. We deal with these two indices in analogy with how we dealt with $B^{ij}$ in the preceding paragraph. First, take out the trace: $C^{ijk} = \tilde C^{ijk} + \delta^{ij}D^k$, where $\tilde C^{ijk}$ is symmetric and traceless in its first two indices, so that $\delta^{ij}\tilde C^{ijk} = 0$. Readers conversant with group theory would now recognize that the first two indices on $\tilde C^{ijk}$ transform like a 5 of $SO(3)$, and the third index like a 3, so that our problem is solved by the decomposition $5 \times 3 = 3 + 5 + 7$. Those who know quantum mechanics would know furthermore that this is the problem of combining an angular momentum 2 object and an angular momentum 1 object. Those familiar with neither group theory nor quantum mechanics, fear not! You can simply contemplate the decomposition

$$\tilde C^{ijk} = (E^i\delta^{jk} + E^j\delta^{ik}) + (F^{il}\varepsilon^{jkl} + F^{jl}\varepsilon^{ikl}) + G^{ijk} \qquad (31)$$

where $F^{ij}$ and $G^{ijk}$ are both totally symmetric and traceless. In other words, if we contract any two indices on $G^{ijk}$ with the Kronecker delta, we get zero. (Strictly speaking, keeping $\tilde C^{ijk}$ traceless in its first two indices requires subtracting $\frac{2}{3}\delta^{ij}E^k$ from the first parenthesis; that term has the same form as $\delta^{ij}D^k$ and can be absorbed into $D^k$.)

Let us count to provide one check. The first two indices on $\tilde C^{ijk}$ take on 5 different values, while the third index takes on 3 different values, giving us $5 \cdot 3 = 15$ independent components. On the other side of the ledger, $E^i$ contains 3 components, and $F^{ij}$ 5 components. How many components does $G^{ijk}$ have? Readers with an elephant’s memory would remember that we counted the number of symmetric triplets of indices back in chapter I.6: $\frac{1}{3!}D(D + 1)(D + 2)$ if each index can take on $D$ values; for $D = 3$, we have $\frac{1}{6}(3 \cdot 4 \cdot 5) = 10$. Remembering the three traceless conditions $G^{ijk}\delta^{ij} = 0$, we conclude that $G^{ijk}$ has 7 components. Indeed, $15 = 3 + 5 + 7$; this is “exactly the same” equation as what we had written above, except for the conceptual fact that the earlier equation refers to representations of $SO(3)$.
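
The counting above is easy to automate. The following sketch, with function names invented for this illustration, counts the components of a totally symmetric tensor of a given rank and subtracts the trace conditions, which for $D = 3$ reproduces the dimensions 3, 5, and 7.

```python
from math import comb

def symmetric_components(rank, dim=3):
    """Number of independent components of a totally symmetric rank-`rank` tensor."""
    return comb(dim + rank - 1, rank)

def symmetric_traceless_components(rank, dim=3):
    """Subtract the trace conditions, which form a symmetric tensor of rank - 2."""
    traces = symmetric_components(rank - 2, dim) if rank >= 2 else 0
    return symmetric_components(rank, dim) - traces

dims = [symmetric_traceless_components(rank) for rank in (1, 2, 3)]
print("E, F, G components:", dims)
print("5 * 3 =", symmetric_traceless_components(2) * symmetric_traceless_components(1),
      " and 3 + 5 + 7 =", sum(dims))
```

For rank 3 and $D = 3$ the formula gives $\frac{1}{6}(3 \cdot 4 \cdot 5) = 10$ symmetric components minus 3 traces, that is, 7.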

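We should not forget that all this work is needed to solve $C^{ijk}(\delta^{jk}r^2 - 3x^jx^k) = 0$, which now becomes $(\delta^{ij}D^k + E^i\delta^{jk} + E^j\delta^{ik} + F^{il}\varepsilon^{jkl} + F^{jl}\varepsilon^{ikl} + G^{ijk})(\delta^{jk}r^2 - 3x^jx^k) = 0$, from which we obtain $D^i = -E^i$, $F^{ij} = 0$, and $G^{ijk} = 0$.

Let’s take stock and summarize what we have wrought thus far:

$$\bar h_{00} = \frac{A}{r} + \frac{A^ix^i}{r^3} + O\left(\frac{1}{r^3}\right)$$

$$\bar h_{0i} = \frac{Bx^i + 2\varepsilon^{ijk}x^jJ^k}{r^3} + O\left(\frac{1}{r^3}\right)$$

$$\bar h_{ij} = \frac{\delta^{ij}D^kx^k - D^ix^j - D^jx^i}{r^3} + O\left(\frac{1}{r^3}\right)$$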
You might be disappointed that after all this work we are still left with quite a mess. But in Einstein gravity, we have yet another trick up our sleeves: we can perform a coordinate transformation, otherwise known as a gauge transformation. From (2), we are free to change $\bar h_{\mu\nu} \to \bar h'_{\mu\nu} = \bar h_{\mu\nu} + \partial_\mu\varepsilon_\nu + \partial_\nu\varepsilon_\mu - \eta_{\mu\nu}\,\partial\cdot\varepsilon$ (the overall sign of $\varepsilon_\mu$ is a matter of convention). Choose $\varepsilon_\mu = (B/r, 0, 0, 0)$ and we can knock off the $B$ term in $\bar h_{0i}$. Next, choose $\varepsilon_0 = 0$ and $\varepsilon_i = -D^i/r$ to knock all $D$ terms from $\bar h_{ij}$. But oops, you might worry, because in this step $\bar h_{00} \to \bar h'_{00} = \bar h_{00} + \partial_i\varepsilon_i = \bar h_{00} + \frac{D^ix^i}{r^3}$ and $D^i$ pops its ugly head up somewhere else. Again, fear not, it can be absorbed into the $A^i$ term. Finally, since $\frac{1}{|\vec x + \vec a|} = \frac{1}{r}\left(1 - \frac{\vec a\cdot\vec x}{r^2} + \cdots\right)$, we can knock off the $A^i$ term in $\bar h_{00}$ by suitably translating our coordinate system.

The bottom line is that, quite remarkably, the far field of any gravitating system, as long as it is stationary, whether a black hole or a collection of rocks, can be beaten down from (30) to

$$\bar h_{00} = \frac{A}{r} + O\left(\frac{1}{r^3}\right)$$

$$\bar h_{0i} = \frac{2\varepsilon^{ijk}x^jJ^k}{r^3} + O\left(\frac{1}{r^3}\right)$$

$$\bar h_{ij} = O\left(\frac{1}{r^3}\right) \qquad (32)$$


The far field depends on a number and a vector, just as in (29), even though we made fewer assumptions. Of course, it also agrees with the multipole expansion (15) and (17), discussed in the text; the quadrupole and higher moments, which depend on the shape of the mass distribution, are hidden precisely in the $O\left(\frac{1}{r^3}\right)$ terms in (32).

Our final task is to convert to the more physically relevant $h_{\mu\nu} = \bar h_{\mu\nu} - \frac{1}{2}\eta_{\mu\nu}\bar h$. In a calculation understood to be accurate up to but not including $O\left(\frac{1}{r^3}\right)$ terms, and similar to one performed in the text, we have $\bar h = -A/r$; then $h_{00} = A/(2r)$ and $h_{ij} = \delta_{ij}A/(2r)$, leading to the metric*

$$ds^2 = -\left(1 - \frac{2M}{r}\right)dt^2 - \frac{4J}{r^3}(x\,dy - y\,dx)\,dt + \left(1 + \frac{2M}{r}\right)(dx^2 + dy^2 + dz^2) \qquad (33)$$
r r r 


(with the z-axis defined by the direction of $\vec J$). We have identified $A = 4M$ (in units with $G = 1$). The vector $\vec J$ as it appears naturally here, in the tensor decomposition of $B^{ij}$, should be regarded as the definition of angular momentum. As I emphasized in chapter VII.5, it may be used to define the angular momentum of a rotating black hole. As shown by (20), it reduces to what we mean by angular momentum for a slowly rotating object.


Exercises 


1 Use the equation of geodesic deviation derived in chapter IX.3 to describe the behavior of a pair of particles 
when a gravitational wave passes by. Recover the result derived in the text. Hint: Use the TT gauge. To leading 
order, everything simplifies. 


2 Derive the quadrupole formula (16). Hint: Integrate by parts to obtain

$$\int d^3y\, y^j\partial_k T^{ik}(t, \vec y) = -\int d^3y\, T^{ij}(t, \vec y) + \int d^3y\, \partial_k\left[y^jT^{ik}(t, \vec y)\right]$$

where the second integral on the right hand side can be converted by Gauss’s theorem to a surface integral. If we integrate over a large enough region enclosing the source, then $T^{ik} = 0$ over the surface and the surface integral can be dropped. We thus obtain the identity

$$\int d^3y\, T^{ij}(t, \vec y) = -\int d^3y\, y^j\partial_k T^{ik}(t, \vec y)$$

Use this identity and energy momentum conservation $\partial_\mu T^{\mu\nu} = 0$ (that is, $\partial_0T^{0\nu} + \partial_kT^{k\nu} = 0$).


3 Solve (10) far outside a static spherically symmetric mass distribution. You should be able to recover the 
asymptotic (that is, large r) form of the Schwarzschild metric. By the way, you could determine the answer 
by invoking some of the general results in the text, but that is not the point of the exercise. After all, we 
already know the Schwarzschild metric; the point is to see how its asymptotic behavior can be obtained 
directly from (10). 


* Note that this metric for J = 0 does not reduce to the far field of the usual Schwarzschild solution, but of 
the solution written in the form given in exercise VI.3.3. 


Notes


1. M. Bartusiak, “Einstein’s Unfinished Symphony,” 2000.

2. See D. Kennefick, “Einstein versus the Physical Review,” Physics Today, September 2005, p. 43.

3. Some students are rightfully concerned about the meaning of the expansion. The fastidious may want to write $g_{\mu\nu} = \eta_{\mu\nu} + \lambda h_{\mu\nu}$ and $x^\mu \to x'^\mu = x^\mu + \lambda\varepsilon^\mu(x)$, expand equations such as $g'_{\mu\nu}(x') = g_{\rho\sigma}(x)\frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}$ as a series in $\lambda$, and equate powers of $\lambda$.

4. See QFT Nut, chapters II.7 and III.4, particularly the footnote on p. 144 about Lorenz versus Lorentz.

5. Hence the name harmonic.

6. Historically, a confusing debate on whether gravitational waves could be removed by a coordinate transformation went on for some time. See the first footnote in this chapter. One subtlety is that if we consider a plane wave, as in standard treatments of electromagnetism, the infinite amount of energy contained in the wave would curl up spacetime, so localized wave packets must be used. We ignore all such subtleties in this introductory text.

7. At the moment, we have LIGO (short for the Laser Interferometer Gravitational Wave Observatory) in the United States, VIRGO in Italy, and GEO in Germany. Since the reader can easily find a list of projects on the web, I refrain from giving a more complete list that may be outdated soon. For example, in an earlier draft of this chapter, I had mentioned LISA (the Laser Interferometer Space Antenna) consisting of three spacecraft in orbits around the sun, but currently it is not funded, and even the proof-of-concept mission, LISA Pathfinder, is not scheduled until 2014.

8. The mathematical correspondence can be pursued further. With the correspondence $h_{00} \sim \Phi$, $h_{0i} \sim A_i$, that is, $h_{0\mu} \sim A_\mu$, you can work out Einstein’s field equation in the post-Newtonian approximation and show that it has exactly the same form as Maxwell’s equation. (Indeed, you can see that the leading Newtonian approximation $\nabla^2\Phi \sim \rho$ is just Coulomb’s law; generations of students have probably noticed that the partial differential equation for the gravitational potential and the electrostatic potential are identical in form.) You could then go on and indulge in some rather far-out speculations. For example, since we can add an as-yet-unobserved magnetic monopole to Maxwell’s equation, we could ask if there is a gravitational analog of the magnetic monopole in Einstein gravity. See A. Zee, Phys. Rev. Lett. 55 (1985), p. 2379.

9. We see explicitly that the action contains two powers of $\partial$, and so the cosmological term is excluded. For the same reason, the higher derivative terms that we will discuss in chapter X.3 are also excluded.

10. For more, see chapter VIII.1 in QFT Nut.

11. See QFT Nut, p. 439. Indeed, this is essentially how Feynman does it.

12. For example, see QFT Nut, pp. 23-24.

13. The “poor man” I followed here is Landau. While he may not be rigorous enough for the jungle patrol on the Amazon, he is plenty rigorous for me.


IX.5 A Road Less Traveled 


“So great an absurdity” 


Many roads lead to Einstein gravity. Back in chapter VI.1, I “air lifted” you over one of the 
shortest ways I know of to Einstein’s celebrated field equation. Here I show you a road less 
traveled. 

In chapters II.1 and II.3, I reminded you that in Newtonian gravity, the gravitational potential $\Phi$ is determined in terms of the mass distribution $\rho$ by

$$\nabla^2\Phi(\vec x, t) = 4\pi G\rho(\vec x, t) \qquad (1)$$


Look at this equation: any change in the mass distribution will be instantaneously communicated to the Newtonian potential. The gravitational potential $\Phi$ is slavishly yoked to the matter distribution.

Newton himself worried about this action at a distance. How could a planet know 
instantly any change in the position of its star? In the Principia, he left* this conundrum “to 
the consideration of the reader.” But he did fret, and in a 1693 letter to his friend Richard 
Bentley, he opined: 


That gravity should be innate, inherent and essential to matter so that one body may act upon 
another at a distance through a vacuum without the mediation of anything else by and through 
which their action or force may be conveyed from one to another is to me so great an absurdity 
that I believe no man who has in philosophical matters any competent faculty of thinking can 
ever fall into it.¹


Tell me, when you first learned about the inverse square law, did you not find it bizarre? 
Would Newton have described you as lacking in “faculty of thinking”? 


* There is perhaps a lesson here somewhere for the young theoretical physicists reading this book. Newton 
was content to postulate the inverse square law and then explore its consequences. He left its dynamical origin? to 
others like Descartes, whose theory of vortices sweeping the planets along was swept into the dustbin of history. 
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Bringing time to gravity 


By now, with your vast knowledge of Lorentz invariance and of relativistic completion, you 
know that we should bring time into the picture and thus promote V? to a? = —d? + V?, 
so that (1) is promoted to 07® = 41 Gp. 

One important remark is that this modification immediately implies the existence of
gravitational waves: something has to propagate. In empty spacetime, far from any matter
distribution, the equation $\partial^2\Phi(x) = 0$ has the wave solution
$\Phi(x) = a\cos(\omega t - \vec k\cdot\vec x) + b\sin(\omega t - \vec k\cdot\vec x)$, with
$\omega^2 = \vec k^{\,2}$. Indeed, we already encountered gravitational waves in the
preceding chapter (and in chapter II.3).
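
As a quick check of the wave solution, here is the one-line computation, written in units with $c = 1$ and with the convention $\partial^2 = -\partial_t^2 + \nabla^2$. The arrow notation $\vec k$ for the wave vector is an assumption made here for readability.

```latex
\begin{align*}
\Phi(t,\vec x) &= a\cos(\omega t - \vec k\cdot\vec x) + b\sin(\omega t - \vec k\cdot\vec x), \\
\partial_t^2\Phi &= -\omega^2\,\Phi, \qquad \nabla^2\Phi = -\vec k^{\,2}\,\Phi, \\
\partial^2\Phi &= \left(-\partial_t^2 + \nabla^2\right)\Phi
  = \left(\omega^2 - \vec k^{\,2}\right)\Phi = 0
  \quad\Longleftrightarrow\quad \omega^2 = \vec k^{\,2}.
\end{align*}
```

The disturbance thus travels at speed $\omega/|\vec k| = 1$, that is, at the speed of light in these units.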

Historically, Laplace did have the foresight and insight to speculate about the speed of
propagation $c_G$ of the effect of gravity. Unfortunately, he concluded erroneously that
$c_G \gg c$. These days, particle theorists subscribe to something known as the naturalness dogma³
(or doctrine if you prefer), saying that fundamental constants with the same dimension
should have roughly the same order of magnitude.* Otherwise, we would be confronted
by a “hierarchy problem.” So perhaps nowadays the default view would be that $c_G \sim c$.
Of course, we now understand that the speed of propagation is a universal constant, a
property of spacetime rather than the individual interaction. But before this understanding,
it would seem strange, perhaps even bizarre, that gravitational and electromagnetic waves
would propagate at precisely the same speed $c$. Conceivably, some bright young guy in
another civilization far far away could have proposed the existence of gravitational waves
with $c_G = c$ long before a complete understanding of curved spacetime was established.

But once you promote the Laplacian to the d’Alembertian $\partial^2$, you are obliged to also
promote the mass density $\rho$. Here, as Robert Frost⁴ said, we are at a fork faced with two
roads. As we saw in chapter III.6, by studying how $\rho$ transforms under a Lorentz boost,
we would naturally promote $\rho$ to an energy density $T^{00}$, the time-time component of an
energy momentum tensor $T^{\mu\nu}$. As we will see, traveling down this road, we will arrive at
Einstein gravity.


Traveling down the wrong road 


The Finnish physicist Gunnar Nordström (1881-1923) pointed out another possibility,
namely that $\rho$ could be promoted to the Lorentz scalar
$T \equiv T^\mu{}_\mu = \eta_{\mu\nu}T^{\mu\nu}$. In the nonrelativistic limit, since
$T^{ij} \ll T^{00}$, $-T$ reduces to $T^{00}$. So, $-T$ and $T^{00}$ are both suitable
role models for $\rho$ to aspire to grow up into. In either case, we will recover Newtonian
gravity. In Nordström’s theory, the field equation for gravity reads


$$\partial^2\Phi = -4\pi G\,T \qquad (2)$$


and the gravitational field $\Phi(x)$ is a Lorentz scalar.


* More in chapter X.3. 
Alas for Nordström, he chose the wrong road.* Nature does not† make use of this
possibility. Incidentally, he did this work⁵ before Einstein formulated his theory of general
relativity.

After special relativity was established, which after all was to make mechanics compat- 
ible with electromagnetism and its Lorentz invariance, Einstein was not the only one to 
realize that Newtonian gravity (1) also had to be made compatible with Lorentz invariance. 
Others in the race included Max Abraham (1875-1922) and Gustav Mie (1869-1957). 


A less traveled road to Einstein gravity 


Compared with understanding gravity, the special theory of 
relativity was mere child’s play. 


—A. Einstein writing to Arnold Sommerfeld, 1912 


While $\rho$ becomes a component of the tensor $T^{\mu\nu}$, the left hand side of
$\partial^2\Phi = 4\pi G\rho$, for the equation to make sense, is also compelled to be a
component of a tensor. We are thus forced to promote $\Phi(x)$ to a symmetric tensor field
$h_{\mu\nu}(x)$ and write


$$\partial^2 h_{\mu\nu} = -8\pi G\,T_{\mu\nu} \qquad (3)$$

where we have defined $h_{00} = -2\Phi$ to agree with our earlier discussion.

After the preceding chapter, it does not take much to guess that the field $h_{\mu\nu}$ will
rather naturally turn out to be the deviation of the metric
$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ from the
Minkowski metric. Thus, once we decide not to wander off with Nordström, we are
practically committed to curved spacetime. Indeed, as soon as we write down (3), we are
led inexorably to Einstein gravity, as was shown by Stanley Deser, collaborating with David
Boulware and echoing various earlier and later works of Suraj Gupta, Robert Kraichnan,⁶
Richard Feynman,⁷ Steve Weinberg,⁸ and others.

I merely sketch how we will end up with Einstein gravity, suppressing indices for clarity.
The action that would lead to (3) has the form
$S_1 \sim \int d^4x\,(\frac{1}{G}\,\partial h\,\partial h + hT)$, with the (invisible)
indices contracted by $\eta_{\mu\nu}$. Indeed,
$\delta S_1 \sim \int d^4x\,(\frac{1}{G}\,\partial h\,\partial\delta h + \delta h\,T) \sim \int d^4x\,(-\frac{1}{G}\,\partial^2 h + T)\,\delta h = 0$,
giving us $\frac{1}{G}\,\partial^2 h \sim T$. I must confess that living the unindexed life has its charms.
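
The one step hidden in the $\sim$ signs is an integration by parts. Here is a minimal sketch with indices still suppressed, under two assumptions stated plainly: numerical factors such as the 2 below are absorbed into $\sim$ in the text, and $\delta h$ vanishes at infinity so that the surface term drops.

```latex
\begin{align*}
\delta\!\int d^4x\,\frac{1}{G}\,\partial h\,\partial h
  &\sim \frac{2}{G}\int d^4x\,\partial h\,\partial(\delta h)
   = -\frac{2}{G}\int d^4x\,(\partial^2 h)\,\delta h + \text{(surface term)}, \\
\delta\!\int d^4x\,hT &= \int d^4x\,T\,\delta h .
\end{align*}
```

Requiring the sum to vanish for arbitrary $\delta h$ gives $\frac{1}{G}\,\partial^2 h \sim T$, the schematic form of the field equation.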


Keep on iterating 


But this action can't be the end of the story. The very fact that the field $h_{\mu\nu}(x)$ is endowed
with dynamics (that it can wriggle in spacetime) means that it carries energy and
momentum. In $S_1$, we include in $T^{\mu\nu}$ only the contribution of the matter fields. Here the
word “matter” is used in the same sense as in chapter VI.4 and includes anything but the
gravitational field $h_{\mu\nu}$. We are now forced to include the contribution of $h_{\mu\nu}$ to
$T^{\mu\nu}$ as well.


* Had Nature chosen this road, you wouldn't have to learn Riemannian geometry to master gravity.

† This is one of my favorite examples of the need to read with sophistication Einstein’s dictum about making
physics as simple as possible. Simple does not necessarily mean less math. Nature couldn’t care less about how
much or how little math you learned in school.

In chapter VI.4, we learned, given an action, how to determine its contribution to $T^{\mu\nu}$,
even if the action is given merely in flat spacetime. We pretend momentarily that $\eta_{\mu\nu}$ is
actually $g_{\mu\nu}$, we vary $g_{\mu\nu}$ to obtain $T^{\mu\nu}$, and then quickly set
$g_{\mu\nu}$ back to $\eta_{\mu\nu}$, doing all this in our heads.

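Apply this procedure to the term $\frac{1}{G}\,\partial h\,\partial h$ in $S_1$. Since this term carries 6 indices (which we
have suppressed), there are actually 3 “invisible” $\eta$s lurking in this term, in the schematic
form $\sim \frac{1}{G}\,\eta\eta\eta\,\partial h\,\partial h$. Promoting $\eta_{\mu\nu}$ to $g_{\mu\nu}$,
varying $g_{\mu\nu}$, and then setting $g_{\mu\nu}$ back to $\eta_{\mu\nu}$,
we obtain a contribution to $T^{\mu\nu}$ of the schematic form $\frac{1}{G}\,\eta\eta\,\partial h\,\partial h$, where two indices
are left dangling to match the $\mu\nu$ indices on $T^{\mu\nu}$. In other words, we have to shift
$hT \to h(T + \frac{1}{G}\,\eta\eta\,\partial h\,\partial h)$. Including this contribution, we are forced to the action
$S_2 \sim \int d^4x\,(\frac{1}{G}\,\partial h\,\partial h + hT + \frac{1}{G}\,h\,\partial h\,\partial h)$. (Our notation is evidently such that after every step, the
letter $T$ in the schematic form of the action once again includes only the contribution of
the matter fields. We also suppress the many $\eta$s lurking in the action.)

You see how the game goes: we keep on iterating. The term $\frac{1}{G}\,h\,\partial h\,\partial h$ in $S_2$ will now force
us to include a term of the form $\frac{1}{G}\,h\,\partial h\,\partial h$ in $T$. Hence we are led by the nose to the action
$S_3 \sim \int d^4x\,(\frac{1}{G}\,\partial h\,\partial h + hT + \frac{1}{G}\,h\,\partial h\,\partial h + \frac{1}{G}\,h^2\,\partial h\,\partial h)$.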

Before you can shout “Here comes $S_4$!”, you see that the action will iterate to an
infinite series
$S \sim \int d^4x\,\{\frac{1}{G}(\partial h\,\partial h + h\,\partial h\,\partial h + h^2\,\partial h\,\partial h + \cdots) + hT\}$.
This much is easy to
understand and makes sense physically, as I will explain presently. The hard part is to
show, as Deser and company did, that the series
$\frac{1}{G}(\partial h\,\partial h + h\,\partial h\,\partial h + h^2\,\partial h\,\partial h + \cdots)$ sums
to the Lagrangian $\frac{1}{G}\sqrt{-g}\,R$ in Einstein gravity.*

In other words, the claim is that, upon substituting $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ into the
Einstein-Hilbert Lagrangian and expanding in $h$, we will obtain the series we found by iterating.
But in a sense this is hardly surprising, since general covariance forces us to the
Einstein-Hilbert Lagrangian uniquely (as we saw in chapter VI.1).

In chapter V.2, we extolled the far-reaching power of the equivalence principle. If we 
know the action governing any interaction in flat Minkowskian spacetime, we immediately 
know the action governing that interaction in curved spacetime, in the presence of gravity: 
all we have to do, we learned, is to promote $\eta_{\mu\nu}$ to $g_{\mu\nu}$. But there is one interaction
we cannot apply this wondrous stupendous trick to, so take back the word “any” in the 
preceding sentence. That very special interaction is the gravitational interaction itself! We 
knew gravity only in Newtonian, not Minkowskian, spacetime. What we have learned in 
this chapter is that if we try to construct the gravitational action in Minkowskian spacetime 
iteratively, we end up with the Einstein-Hilbert action in curved spacetime. 

Instead of working with the action, we could also have worked with the equation of 
motion (3). Our task would then be to find a combination, involving two derivatives and the 
metric, with which to replace $\partial^2 h_{\mu\nu}$ on the left hand side, as we had already anticipated way
way back in part II. Gauss, Riemann, and Ricci solved this highly nontrivial problem for us. 


* To keep things simple, we did not mention that starting with hT, we generate an infinite series that will also 
sum up appropriately. 
Geometry emerges


This discussion underlines the importance of the geometrical view of special relativity that 
we have emphasized in part III, in contrast to the view of special relativity as a series of 
apparent paradoxes. We could certainly work through the theory of special relativity apply- 
ing the Lorentz transformation to various apparent paradoxes, without once mentioning 
the word “geometry.” But that would be impoverishing physics. 

Imagine yourself in a galaxy far far away, where people have never heard of Einstein
gravity. But people know about the Newtonian equation (1)
$\nabla^2\Phi(\vec x, t) = 4\pi G\rho(\vec x, t)$, and
Lorentz invariance has just been discovered. Suppose you then try to make this equation
Lorentz invariant, following the road less traveled outlined here, and thus promoting $\Phi$ to
$h_{\mu\nu}$. If you had understood special relativity in terms of the geometry of spacetime (that
is, understood $\eta_{\mu\nu}dx^\mu dx^\nu$ as the generalized distance between two nearby points), then
you would naturally interpret $(\eta_{\mu\nu} + h_{\mu\nu})dx^\mu dx^\nu$ as an even more generalized distance
between two nearby points. Geometry of curved spacetime naturally emerges. But the
poor sap who knows special relativity as a series of paradoxes may have a hard time seeing
curved spacetime.


The graviton interacts with itself 


The physics behind the infinite series
$\frac{1}{G}(\partial h\,\partial h + h\,\partial h\,\partial h + h^2\,\partial h\,\partial h + \cdots)$ is easy to
understand with a minimal knowledge⁹ of quantum field theory. You have no doubt heard, and
I have already mentioned, that when we quantize the electromagnetic field, we obtain the
photon, and when we quantize the gravitational field, we obtain the graviton. A huge
difference between the photon and the graviton is that the photon couples to charged fields, such
as the electron field, but is not charged itself. In contrast, the graviton couples to anything
carrying energy and momentum, and since it certainly carries energy and momentum, it
couples to itself. In the electromagnetic action given in chapter IV.2, the photon couples to
the charged fields that comprise the electromagnetic current $J^\mu$ via the term $A_\mu J^\mu$ in the
action. Similarly, in the Einstein-Hilbert action, the graviton couples to matter fields that
comprise the energy momentum tensor $T^{\mu\nu}$ via the term $h_{\mu\nu}T^{\mu\nu}$. But in addition, there
are an infinite¹⁰ number of terms of the form $h\cdots h\,\partial h\,\partial h$ describing the complicated
interactions of many gravitons with one another. This partly accounts for the intractability
of quantum gravity at present.

To say all this in a slightly different way, we recall from chapter VI.4 that $T^{\mu\nu}$ is not
locally conserved, in contrast to the electromagnetic current $J^\mu$. The physics behind this
fact is the ability of the gravitational field to exchange energy momentum with $T^{\mu\nu}$.


Appendix 1: From electrostatics to Maxwell 


It is instructive to repeat for electromagnetism what we did in this chapter. Suppose we start with electrostatics,
with Poisson’s equation $\nabla^2\Phi(\vec x, t) = -\rho(\vec x, t)$ determining the electrostatic potential $\Phi$ in terms of the charge
density $\rho$. Indeed, it is essentially the same equation as (1) that we started this chapter with.

Again, we complete relativistically, except that now $\rho$ is promoted to be the time component of a Lorentz
vector $J^\mu$ instead of the time-time component of a Lorentz tensor $T^{\mu\nu}$. This forces us to promote $\Phi$ to be the
time component of a Lorentz vector $A^\mu$, leading us to the equation $\partial^2 A^\mu = -J^\mu$. The next step is motivated by
our desire to have current conservation fall out of the equation of motion. This would happen if we change¹¹ our
equation to $\partial_\mu(\partial^\mu A^\nu - \partial^\nu A^\mu) = -J^\nu$, from which $\partial_\nu J^\nu = 0$ follows as an identity. You might realize that this
last remark in fact foreshadows energy momentum conservation falling out of the Bianchi identity, as we saw in
chapter VI.5.
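
To see why current conservation is automatic, it helps to name the antisymmetric combination in the modified equation. The symbol $F^{\mu\nu}$ below is the standard field strength; it is introduced here only for this sketch.

```latex
\begin{align*}
F^{\mu\nu} &\equiv \partial^\mu A^\nu - \partial^\nu A^\mu = -F^{\nu\mu},
  \qquad \partial_\mu F^{\mu\nu} = -J^\nu, \\
\partial_\nu J^\nu &= -\,\partial_\nu\partial_\mu F^{\mu\nu} = 0 .
\end{align*}
```

The last step holds because $\partial_\nu\partial_\mu$ is symmetric in $\mu\nu$ while $F^{\mu\nu}$ is antisymmetric, so their contraction vanishes whatever $A^\mu$ is.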


Appendix 2: Gravity is feeble, and the Planck mass is huge 


As we have noted since the very beginning of this text, the immensity of the Planck mass $M_P \equiv 1/\sqrt{G} \simeq 10^{19}$ GeV
directly reflects the extreme feebleness of gravity: $G$ is teeny, so $M_P$ is humongous. This is merely a simple matter
of high school algebra, but it is rendered particularly clear by the weak field discussion of this chapter. Write the
action in the text as $S \sim \int d^4x\,\{M_P^2(\partial h\,\partial h + h\,\partial h\,\partial h + h^2\,\partial h\,\partial h + \cdots) + hT\}$. Now scale $h \to h/M_P$, so that


$$S \sim \int d^4x\,\left[\left(\partial h\,\partial h + M_P^{-1}\,h\,\partial h\,\partial h + M_P^{-2}\,h^2\,\partial h\,\partial h + \cdots\right) + M_P^{-1}\,hT\right] \qquad (4)$$


The gravitational field $h$ is conventionally normalized in the sense that the “kinetic energy” term $\partial h\,\partial h$ has
coefficient unity. But now we see that the interaction of $h$ with the rest of the world (as represented by $T$) and
with itself scales like inverse powers of $M_P$, as expected.


Appendix 3: A shorter road less traveled 


Using the Palatini formalism introduced in chapter VI.5 (which you may wish to review now), we can shorten
the road less traveled described in this chapter. I am content to do it schematically, omitting the indices.
A flat spacetime version of Palatini would have started with two fields $h_{\cdot\cdot}$ and $\Gamma^{\cdot}_{\cdot\cdot}$, governed by the action
$S \sim \int (h\,\partial\Gamma + hT)$, where the appearance of the Minkowski metric when needed is understood. Upon varying
$h$, we get $\partial\Gamma \sim T$. But we don’t know what $\Gamma$ is. To remedy this, add a quadratic term $\Gamma\Gamma$ to the action, so that
$S \sim \int (h\,\partial\Gamma + hT + \Gamma\Gamma)$. Varying $\Gamma$, we obtain $\Gamma \sim \partial h$, which when plugged into the previous equation, leads
us to $\partial^2 h \sim T$, namely (3). By the argument in the text, we now have to introduce the cubic term $h\Gamma\Gamma$, which
together with $\Gamma\Gamma$, we then recognize as the first 2 terms in the expansion of $\sqrt{-g}\,g^{\cdot\cdot}\Gamma^{\cdot}_{\cdot\cdot}\Gamma^{\cdot}_{\cdot\cdot}$. We thus recover the
Palatini version of the Einstein-Hilbert action $S_{EH} \sim \int d^4x\,\sqrt{-g}\,g(\partial\Gamma + \Gamma\Gamma)$, namely the Palatini action given in
chapter VI.5.


Notes 


1. R. Bentley, Works of Richard Bentley, vol. 3, Francis Macpherson, 1838.

2. I might call the Descartes approach the “all or nothing approach,” which some theoretical physicists still 
indulge in. At any stage in the development of physics, certain questions are not appropriate; for instance, 
somebody could always demand of Newton, “Hey Isaac, so why inverse square?” 

3. We will come back to this idea in chapter X.7. 

4. Written in that famous year 1915, by the way. 

5. For the controversial relationship between Nordström and Einstein, see the letters by P. Freund and E. L.
Schucking in Physics Today, August 2009, p. 8. 

6. Incidentally, Kraichnan did his work as an 18-year-old undergraduate at the Massachusetts Institute of 
Technology. As a postdoc at the Institute for Advanced Study, he showed his work to Einstein, who was 
appalled by this so-called “particle physics” approach, in contrast with the geometrical approach. He delayed 
publication for 8 years and ended up publishing after Gupta. Perhaps partly as a result of this encounter with 
Einstein, Kraichnan left the field and later became an eminent authority on turbulence. 

7. Feynman’s work came out of his effort to quantize gravity. In one story, when Feynman told his colleague
Murray Gell-Mann about his research, the latter told him to try quantizing Yang-Mills theory as a warm-up
exercise. Feynman wrote his wife Gweneth a famous letter from the Warsaw conference on gravity in 1962,
in which he said, “Remind me not to come to any more gravity conferences.” Also, “Now I will show that
I too can write equations that nobody can understand.” These two remarks more or less summed up the
attitude of particle theorists to Einstein gravity until the mid-1970s.

8. In this particle physics approach, as championed by Feynman and Weinberg, curved spacetime and Riemannian
geometry are not put in but rather fall out. See the preface and introduction to S. Weinberg, Gravitation
and Cosmology, and R. P. Feynman et al., Feynman Lectures on Gravitation.

9. See QFT Nut, chapter VIII.1.

10. The Yang-Mills field is intermediate between the electromagnetic and the gravitational fields in complexity.
The analog of the infinite series we encounter here terminates in Yang-Mills theory. See, for example, QFT
Nut, chapter IV.5.

11. For more details, see QFT Nut, pp. 38 and 188.


Isometry, Killing Vector Fields,
and Maximally Symmetric Spaces 


Why do we love the sphere so much? 


Think about the vast amount of theoretical physics you have learned and you realize the 
enormous role spheres and other symmetrical situations have played in enhancing your 
understanding. Even with the ubiquity of numerical computation these days, analytically 
soluble examples still provide valuable, perhaps indispensable, railings for us to hold on 
to. So too in Einstein gravity: the most intensely studied spacetimes, as you might expect, 
are the most symmetrical. In this chapter, we explore symmetry in the context of space 
and spacetime. 

We love the sphere, obviously because its high degree of symmetry makes it easy to 
work with. Indeed, every point on the sphere is equivalent to every other point. But 
somebody could have given you the metric on the sphere in some awful and unfamiliar 
coordinates, and you may not recognize that it describes the sphere. In fact, were the 
metric $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$ unfamiliar to you, how would you go about discovering that
it possesses the maximal amount of symmetry? The two coordinates $\theta$ and $\varphi$ are treated so
differently. Thus, we definitely need to develop some machinery to uncover any symmetry 
that might be masked by a poor choice of coordinates. 


The isometry condition 


In Riemannian geometry, because of the freedom in choosing coordinates, symmetry is 
not always glaringly advertised. 

If the geometry at point P and the geometry at Q are the same, the two points P and Q are
said to be isometric. The metric for the sphere in the standard coordinates is independent
of $\varphi$, so obviously, two points related by $\varphi' = \varphi + c$ for an arbitrary constant $c$ are isometric.
But the isometry in the $\theta$ direction is not so evident, and the isometry in an arbitrary
direction is even less so.
As always, $g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}$.
Now suppose we require that the metrics $g'$ and $g$,
as functions of their respective arguments, are the same, namely that $g'_{\rho\sigma}(x') = g_{\rho\sigma}(x')$.
Thus, the question of whether the space enjoys any isometry amounts to asking whether
the set of equations


$$g_{\rho\sigma}(x') = g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma} \qquad (1)$$


has any solutions.

Watch the primes like a hawk here! Note carefully that this condition of isometry differs
by a single prime from the far more commonly seen equation
$g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}$,
which we just cited and which you first encountered in chapter I.5 telling us how the metric
$g'$ is determined by the metric $g$. (Indeed, the right hand side in (1) is none other than
$g'_{\rho\sigma}(x')$.) In contrast, the isometry condition (1) compares a given metric $g_{\mu\nu}$ evaluated at
$x$ and $x'$: it imposes a requirement on $g_{\mu\nu}$ that most metrics in fact fail to satisfy.

In general, the condition (1) consists of a set of formidable equations that are difficult 
to solve, but mathematicians have studied them in depth. We are content, however, 
to follow the German minister and mathematician Wilhelm Killing (1847-1923) and 
analyze (1) when the two points are related infinitesimally, $x'^\mu = x^\mu + \xi^\mu(x)$, in the same
spirit we adopted when studying Lie algebra. Indeed, Killing also discovered Lie algebras
independently of Sophus Lie and even anticipated some later developments by Élie Cartan.
Lie, however, bitterly disputed Killing’s claim to Lie algebras. 


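So, let’s expand (1) out to linear order in the small parameter $\xi$. Setting
$\frac{\partial x^\mu}{\partial x'^\rho} = \delta^\mu_\rho - \partial_\rho\xi^\mu + O(\xi^2)$
and collecting terms of order $\xi$, we find

$$g_{\mu\sigma}\,\partial_\rho\xi^\mu + g_{\rho\nu}\,\partial_\sigma\xi^\nu + \xi^\lambda\partial_\lambda g_{\rho\sigma} = 0 \qquad (2)$$


Indeed, if we write the isometry condition (1) as
$0 = g_{\rho\sigma}(x') - g'_{\rho\sigma}(x') = (g_{\rho\sigma}(x') - g_{\rho\sigma}(x)) + (g_{\rho\sigma}(x) - g'_{\rho\sigma}(x'))$,
the expression in the first parentheses provides the $\xi\partial g$
term in (2), while the expression in the second parentheses gives the $g\partial\xi$ terms. Note also
that we have already encountered this combination in chapter VI.5.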
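
For readers who want to see the expansion term by term, here is a sketch to first order in $\xi$. It assumes only $x'^\mu = x^\mu + \xi^\mu(x)$, so that $x^\mu = x'^\mu - \xi^\mu + O(\xi^2)$.

```latex
\begin{align*}
g_{\rho\sigma}(x') &= g_{\rho\sigma}(x) + \xi^\lambda\partial_\lambda g_{\rho\sigma}(x) + O(\xi^2), \\
g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}
 &= g_{\mu\nu}\left(\delta^\mu_\rho - \partial_\rho\xi^\mu\right)
    \left(\delta^\nu_\sigma - \partial_\sigma\xi^\nu\right) + O(\xi^2) \\
 &= g_{\rho\sigma} - g_{\mu\sigma}\,\partial_\rho\xi^\mu - g_{\rho\nu}\,\partial_\sigma\xi^\nu + O(\xi^2).
\end{align*}
```

Equating the two sides as demanded by (1) and canceling $g_{\rho\sigma}(x)$ leaves $g_{\mu\sigma}\partial_\rho\xi^\mu + g_{\rho\nu}\partial_\sigma\xi^\nu + \xi^\lambda\partial_\lambda g_{\rho\sigma} = 0$, which is the condition (2).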

Using the definition of the covariant derivative, we could write (2) more compactly as 


$$\xi_{\rho;\sigma} + \xi_{\sigma;\rho} = 0 \qquad (3)$$


Here we use the semicolon notation introduced in chapter V.6. 

One potential source of confusion for the beginner is that the isometry condition, (2)
or (3), can be looked at in two different ways. Given a metric, we could solve the isometry
condition for the vector $\xi$. Alternatively, we could be given a bunch of $\xi$s and ask how these
isometries restrict the metric.

A vector field $\xi(x)$ satisfying (2) or (3), the two conditions being equivalent, is called
a Killing vector field. I will often be sloppy and omit the word “field” when it is clearly
understood from the context that $\xi$ depends on $x$.

Indeed, you might recall that you already encountered this term in chapter V.4. There
we learned that the metric for a general static isotropic spacetime has the two Killing
vectors $\xi = (1, 0, 0, 0)$ and $\xi = (0, 0, 0, 1)$. Referring to (2), which in this case reduces
to $\xi^\lambda\partial_\lambda g_{\rho\sigma} = 0$, we see that speaking of Killing vectors is just an extra fancy way of saying
that the static isotropic metric in chapter V.4 does not depend on the coordinates $t$ and $\varphi$.

But already back in chapter V.4, you might have seen that there is a theoretical issue. 
The Killing vectors were obvious, because we chose a nice symmetric form for the metric, 
but in principle, as already mentioned, the metric could have been presented to us in some 
poorly chosen coordinates. The condition (2) or (3) tells you how to find the Killing vectors. 
However, by now, you realize that usually we start out with the isometries we want, which 
then guide us to the form of the metric, rather than the other way around of having to find 
the isometries for a given metric. 

For further theoretical analysis, (3) is more suitable, but in an actual search for Killing 
vectors, (2) is simpler. By the way, we know that the ordinary derivatives in (2) must 
metamorphose into covariant derivatives as in (3), since the existence of Killing vectors 
is a coordinate independent statement, and so the existence condition must transform 
properly. 

Those readers with a long memory might make a connection with the concept of Lie 
derivative introduced in chapter V.6. Indeed, the condition (2) had already appeared in 
appendix 2 of chapter V.6, and, rather nicely, asserts that the Lie derivative of the metric in 
the direction of the Killing vector $\xi$ vanishes:


$$\mathcal{L}_\xi\, g_{\mu\nu} = 0 \qquad (4)$$


Killing vectors 


In general, (2) may have a number of solutions; we then label the various Killing vectors
$\xi^\mu_{(a)}$ by an index $a$. Obviously, any linear combination of Killing vectors $\sum_a c_a\,\xi^\mu_{(a)}$ is also a
Killing vector. That (3) is a tensor equation ensures that the number of linearly independent
Killing vector fields does not depend on the coordinates used, evidently, since isometry
reflects an intrinsic property of the space.

As explained in chapter V.4, with any vector $V^\mu$, we can associate the differential
operator $V^\mu\partial_\mu$. Indeed, in more advanced treatments, Killing vectors are thought of as
differential operators, namely as $\xi_{(a)} = \xi^\mu_{(a)}\partial_\mu$. Thus, given a set of Killing vectors, we can
study the isometry algebra generated by commuting the $\xi_{(a)}$s with one another. You can
see how close this is to Lie’s idea. See appendix 5. No wonder there was controversy over
priority.

Let us try out our formalism on a laughably simple case, Euclidean 3-space $E^3$. Then
(2) simplifies to the 6 equations $\partial_x\xi^x = 0$, $\partial_x\xi^y + \partial_y\xi^x = 0$, and so forth. These are easily
solved. For example, acting with $\partial_x$ on the second of the preceding equations and using
the first equation, we obtain $\partial_x^2\xi^y = 0$, showing that the y-component of the Killing vector
$\xi$ can depend on $x$ and $z$ linearly but not on $y$. We find 6 Killing vectors: $\xi_{(1)} = (1, 0, 0)$,
$\xi_{(2)} = (0, 1, 0)$, $\xi_{(3)} = (0, 0, 1)$, $\xi_{(4)} = (y, -x, 0)$, $\xi_{(5)} = (0, z, -y)$, and $\xi_{(6)} = (-z, 0, x)$. As
we would expect, the isometry algebra consists of the usual Euclidean algebra generating
translations and rotations. For example, $\xi^\mu_{(4)}\partial_\mu = y\frac{\partial}{\partial x} - x\frac{\partial}{\partial y}$ generates rotations about the
z-axis, as discussed in detail in chapter I.3.
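
The six solutions can be checked mechanically. In flat space with Cartesian coordinates the metric is constant, so (2) reduces to $\partial_i\xi^j + \partial_j\xi^i = 0$; each Killing vector listed above has the linear form $\xi = Ax + b$, and the condition becomes the statement that the constant matrix $A$ is antisymmetric. The sketch below encodes that check in plain Python. The names `KILLING_E3`, `satisfies_killing`, and `evaluate` are hypothetical, chosen only for this illustration.

```python
# Each Killing vector of E^3 is written as xi^i(x) = sum_j A[i][j] * x[j] + b[i],
# with x = (x, y, z). For such a field d_j xi^i = A[i][j], so the flat-space
# Killing condition d_i xi^j + d_j xi^i = 0 says A[i][j] + A[j][i] = 0.

ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

KILLING_E3 = {
    "xi1": (ZERO, [1, 0, 0]),                                # translation along x
    "xi2": (ZERO, [0, 1, 0]),                                # translation along y
    "xi3": (ZERO, [0, 0, 1]),                                # translation along z
    "xi4": ([[0, 1, 0], [-1, 0, 0], [0, 0, 0]], [0, 0, 0]),  # (y, -x, 0)
    "xi5": ([[0, 0, 0], [0, 0, 1], [0, -1, 0]], [0, 0, 0]),  # (0, z, -y)
    "xi6": ([[0, 0, -1], [0, 0, 0], [1, 0, 0]], [0, 0, 0]),  # (-z, 0, x)
}


def satisfies_killing(A):
    """True if the linear field xi = A x + b obeys d_i xi^j + d_j xi^i = 0."""
    return all(A[i][j] + A[j][i] == 0 for i in range(3) for j in range(3))


def evaluate(field, point):
    """Components of xi at the given point (x, y, z)."""
    A, b = field
    return [sum(A[i][j] * point[j] for j in range(3)) + b[i] for i in range(3)]


if __name__ == "__main__":
    for name, (A, b) in KILLING_E3.items():
        print(name, evaluate((A, b), [1, 2, 3]), satisfies_killing(A))
```

A dilation $\xi = (x, y, z)$, for which $A$ is the identity matrix, fails the test, as it should: rescaling is not an isometry of Euclidean space.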
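A slightly more involved case is the familiar $S^2$. Write out (2) with $g_{\theta\theta} = 1$, $g_{\varphi\varphi} = \sin^2\theta$:

$$\partial_\theta\xi^\theta = 0, \qquad 2\sin^2\theta\,\partial_\varphi\xi^\varphi + \xi^\theta\,\partial_\theta\sin^2\theta = 0, \qquad \partial_\varphi\xi^\theta + \sin^2\theta\,\partial_\theta\xi^\varphi = 0 \qquad (5)$$

These are easily solved to give (writing the Killing vectors conveniently as differential
operators $\xi^\mu_{(a)}\partial_\mu$)


$$\xi_{(1)} = \sin\varphi\,\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\,\frac{\partial}{\partial\varphi}, \qquad \xi_{(2)} = \cos\varphi\,\frac{\partial}{\partial\theta} - \cot\theta\sin\varphi\,\frac{\partial}{\partial\varphi}, \qquad \xi_{(3)} = \frac{\partial}{\partial\varphi} \qquad (6)$$


As I said at the start of this chapter, you could have guessed $\xi_{(3)}$ easily, but $\xi_{(1)}$ and $\xi_{(2)}$ are
less obvious. These generators should look familiar to the reader who has taken courses
on electromagnetism and quantum mechanics. At any given point on the sphere, we can
translate in two directions and rotate around one axis, hence the 3 Killing vectors.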
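
A numerical spot check can confirm that the three vector fields on the sphere satisfy the isometry condition (2). The sketch below evaluates the left hand side of (2), $g_{\mu\sigma}\partial_\rho\xi^\mu + g_{\rho\nu}\partial_\sigma\xi^\nu + \xi^\lambda\partial_\lambda g_{\rho\sigma}$, with central finite differences at one point, so it is a sanity check under that approximation and not a proof. All function names are hypothetical.

```python
import math


def partial(f, x, i, h=1e-5):
    """Central-difference derivative along x[i] of a function returning nested lists."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h

    def diff(a, b):
        if isinstance(a, list):
            return [diff(u, v) for u, v in zip(a, b)]
        return (a - b) / (2 * h)

    return diff(f(xp), f(xm))


def killing_residual(metric, xi, x):
    """Largest |g_{ms} d_r xi^m + g_{rn} d_s xi^n + xi^l d_l g_{rs}| over r, s at x."""
    n = len(x)
    g, v = metric(x), xi(x)
    dxi = [partial(xi, x, i) for i in range(n)]     # dxi[i][m] = d_i xi^m
    dg = [partial(metric, x, i) for i in range(n)]  # dg[l][r][s] = d_l g_{rs}
    worst = 0.0
    for r in range(n):
        for s in range(n):
            val = sum(g[m][s] * dxi[r][m] + g[r][m] * dxi[s][m] for m in range(n))
            val += sum(v[l] * dg[l][r][s] for l in range(n))
            worst = max(worst, abs(val))
    return worst


def sphere_metric(x):
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


# Components (xi^theta, xi^phi) of the three Killing vectors on the unit sphere.
def xi1(x):
    theta, phi = x
    return [math.sin(phi), math.cos(phi) / math.tan(theta)]


def xi2(x):
    theta, phi = x
    return [math.cos(phi), -math.sin(phi) / math.tan(theta)]


def xi3(x):
    return [0.0, 1.0]


def not_killing(x):
    # d/dtheta alone is not an isometry: it changes the size of latitude circles.
    return [1.0, 0.0]


if __name__ == "__main__":
    point = [0.7, 1.3]  # an arbitrary (theta, phi) away from the poles
    for field in (xi1, xi2, xi3, not_killing):
        print(field.__name__, killing_residual(sphere_metric, field, point))
```

The three Killing vectors give residuals at the level of the finite-difference error, while the deliberately wrong field $\partial/\partial\theta$ gives a residual of order one, coming from the term $\xi^\theta\partial_\theta\sin^2\theta$.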


How many Killing vectors can we have? 


From these two elementary examples, it is clear that a D-dimensional Riemannian
manifold can have at most $\frac{1}{2}D(D + 1)$ Killing vectors, corresponding locally to $D$ translations
and $\frac{1}{2}D(D - 1)$ rotations. In both examples, this maximum number is attained, 6 in the
case of $E^3$, and 3 in the case of $S^2$.

The Smart Experimentalist exclaims: “Right! Forget about fancy proofs: D-dimensional 
Euclidean space has this many Killing vectors; how could a space possibly be more 
symmetric?” Go ahead, be my guest, prove it if you must. 

A D-dimensional Riemannian manifold with $\frac{1}{2}D(D + 1)$ Killing vectors is said to be
maximally symmetric. 

Some definitions. A space is homogeneous if there exist translational Killing vectors to
take any point to any other point in its vicinity. A space is isotropic at a given point $X$ if
there exist rotational Killing vectors that leave the point $X$ fixed, that is, $\xi^\mu(X) = 0$, whose
derivatives $\xi_{\rho;\sigma}(X)$ span the basis of D-by-D antisymmetric matrices. This merely means
that every possible rotation about $X$ is an isometry, as befits our intuitive definition of
isotropy. (To see this, regard the derivatives $\xi_{\rho;\sigma}(X)$ as matrices with indices $\rho\sigma$. Note that
(3), that is, $\xi_{\rho;\sigma} = -\xi_{\sigma;\rho}$, implies that these matrices are antisymmetric. The statement
that the space is isotropic means that the generator of any rotation about $X$ can be written
as a linear combination of $\xi_{\rho;\sigma}$ evaluated at $X$.) A homogeneous space isotropic about
some point is obviously isotropic about all points and so is maximally symmetric. All these
statements are fairly straightforward to prove.

For definiteness, we talk about D-dimensional spaces, but clearly, everything we say 
here can be applied to spacetimes. 


Maximally symmetric spaces 


Indeed, if we are told that the space is maximally symmetric, we have $\frac{1}{2}D(D + 1)$
constraints on the Riemann curvature tensor, enough to fix it uniquely. We will prove this in
appendix 4, but for now, let us take the lazy man’s way and try to wing it.

A maximally symmetric space is utterly featureless, so to speak. Every point, every
direction, looks the same. So, what could the Riemann curvature tensor be? There is only
the metric tensor kicking around.

Confusio asks: “What about the derivative of the metric tensor?” 

This is to be an equality between tensors, so, Confusio, we must have the covariant
derivative, not the ordinary derivative. But $g_{\mu\nu;\lambda} = 0$. So, only the metric tensor is available.

To construct the curvature tensor, which carries 4 indices, we need something like
$g_{\tau\mu}g_{\rho\nu}$. Taking into account the many symmetry properties of the curvature tensor, we
see that it must be given by


$$R_{\tau\rho\mu\nu} = K\,(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu}) \qquad (7)$$

with $K$ some constant. We will prove in appendix 4 that this is correct. Later, in
chapters IX.7 and IX.8, we will see that, in the language of differential forms, the Riemann
curvature looks even simpler.

Summing over pairs of indices, we have

$$R_{\rho\nu} = (D - 1)K\,g_{\rho\nu}, \qquad R = D(D - 1)K \qquad (8)$$

Maximally symmetric spaces have constant scalar curvature. This holds for the two almost
trivial examples we know, Euclidean spaces and spheres.
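
The contraction behind (8) takes two lines. This sketch assumes the index placement $R_{\tau\rho\mu\nu} = K(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})$ for (7) and that the Ricci tensor is obtained by contracting the first and third indices; with a different contraction convention the overall sign could differ.

```latex
\begin{align*}
R_{\rho\nu} &= g^{\tau\mu}R_{\tau\rho\mu\nu}
  = K\left(g^{\tau\mu}g_{\tau\mu}\,g_{\rho\nu} - g^{\tau\mu}g_{\tau\nu}\,g_{\rho\mu}\right)
  = K\left(D\,g_{\rho\nu} - g_{\rho\nu}\right) = (D-1)K\,g_{\rho\nu}, \\
R &= g^{\rho\nu}R_{\rho\nu} = (D-1)K\,g^{\rho\nu}g_{\rho\nu} = D(D-1)K .
\end{align*}
```

For instance, $D = 2$ gives $R = 2K$, and flat Euclidean space of any dimension corresponds to $K = 0$.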


Killing vectors and conservation laws 


Physicists love isometries for another reason: Killing vectors are associated with conserva- 
tion laws, a fact we have exploited and remarked upon several times, notably in chapter V.4. 
We can now easily prove this connection between symmetry and conservation, which goes 
way back to Noether’s theorem in chapter II.4. Consider a geodesic described by $X^\mu(\tau)$,
with the tangent or velocity vector $V^\mu(\tau) = \frac{dX^\mu}{d\tau}$. Let $\xi(x)$ be a Killing vector field of the
spacetime. Then $\xi_\mu(X)V^\mu$ is conserved along the geodesic.

To prove this, act with the covariant derivative along the geodesic (recall appendix 1 of
chapter V.6) on the alleged conserved quantity:


$$V^\nu D_\nu(\xi_\mu V^\mu) = V^\nu V^\mu D_\nu\xi_\mu + \xi_\mu\,(V^\nu D_\nu V^\mu) = V^\nu V^\mu\,\xi_{\mu;\nu} = 0 \qquad (9)$$


We used the product rule for the first equality and the geodesic equation for the second
equality. The isometry condition (3) tells us that $\xi_{\mu;\nu}$ is antisymmetric in $\mu\nu$, hence the
last equality.
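
As an illustration, take the unit sphere with $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$ and the Killing vector $\partial/\partial\varphi$, whose components are $\xi^\mu = (0, 1)$. Lowering the index with the metric gives the conserved quantity along any geodesic; the label $L$ is a name assumed here for convenience.

```latex
\begin{align*}
\xi_\mu &= g_{\mu\nu}\,\xi^\nu = (0,\ \sin^2\theta), \\
L &\equiv \xi_\mu V^\mu = \sin^2\theta\,\frac{d\varphi}{d\tau}
   = \text{constant along the geodesic}.
\end{align*}
```

This is the familiar conservation of angular momentum about the polar axis for a particle moving freely on a sphere.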


What isometry is all about: Would the view stay the same? 


I conclude by giving you an intuitive account of what isometry is all about. If you find 
yourself in an unfamiliar landscape, you might want to ask yourself, “If I move a bit in 
that direction, would the view stay the same? If I turn around a bit, would the view stay 
the same?” To find out about the landscape, you look around. If the view stays the same 
as you turn around, the space is isotropic. If the view stays the same as you move without
turning, the space is homogeneous. 

Hiking in a maximally symmetric space would be kind of boring. But in fact, the spatial 
part of our universe appears to be maximally symmetric, as we saw in chapters V.3 and 
VIII.1. Indeed, the future of our universe might approximate a maximally symmetric 
spacetime. Or perhaps maximally symmetric spaces, like the sphere (and the de Sitter and 
anti de Sitter spacetimes we will study in chapters IX.10 and IX.11, respectively), represent 
what we can handle analytically. 


A quick summary of this chapter. You are given a metric $g_{\mu\nu}(x)$. The prickly issue here
is the freedom to change coordinates, so that the intrinsic qualities of the space may be 
totally obscured in the metric, which may look like a mess merely because of a poor choice 
of coordinates. The isometry condition (2) or (3) tells you the different ways you can move 
or turn without changing the view. Each of these ways is associated with a Killing vector, 
the number of which is limited by the dimension of the space. If we have the maximum 
number of possible Killing vectors, then, as you might expect, the space is pretty much 
featureless, so that the Riemann curvature tensor is completely fixed up to an overall 
constant $K$, and all curvature invariants, such as the scalar curvature $R$ or the square
of the curvature tensor $R_{\tau\rho\mu\nu}R^{\tau\rho\mu\nu}$, are all constants given in terms of $K$.


Appendix 1: Coset manifolds 


This might be a good place to mention the concept of coset manifolds. Start with a Lie group G and a subgroup 
H of G. Let us say that two group elements $g_1$ and $g_2$ are equivalent if there exists an element h of H such that 
$g_1 = g_2 h$. Equivalence classes are then defined in the usual way: $g_1$ and $g_2$ belong to the same equivalence class 
if they are equivalent. The next step is to define a space or manifold by associating each equivalence class with a 
point. The resulting manifold is known as the coset manifold G/H. 

As an example, the coset manifold SO(3)/SO(2) is the familiar sphere $S^2$. Why? Let us go slow here. Every 
point P on the sphere $S^2$ is uniquely associated with a unit vector $\hat n$ pointing from the center of the sphere to P. 
Denote by $\hat z$ the unit vector pointing to the north pole, that is, the unit vector pointing in the z direction. Our first 
thought might be to associate the point P with the rotation g (that is, an element of G = SO(3)) that rotates $\hat z$ 
into $\hat n$. The problem is that the rotation g is not uniquely determined. Denote by H = SO(2) the subgroup of G 
consisting of all rotations about the z-axis. Then two rotations $g_1$ and $g_2$ related by $g_1 = g_2 h$, with h an arbitrary 
element of H, would both rotate $\hat z$ into $\hat n$. In other words, $\hat n = g_1\hat z = g_2 h\hat z = g_2\hat z$. Thus, the point P is not to be 
associated with the element $g_1$, but with the entire equivalence class $g_1$ belongs to. In other words, P does not 
specify uniquely the rotation that would take $\hat z$ into the direction vector $\hat n$ associated with P. 

Incidentally, the notation G/H is apt; we consider the elements g of G but with H factored out, so to speak. 

In our example, we need 3 parameters to specify an SO(3) rotation, and 1 parameter to specify an SO(2) 
rotation. Hence, we need 3 - 1 = 2 parameters to specify the equivalence classes and hence the points P in G/H. 
Indeed, $S^2$ is 2-dimensional. In general, the dimension¹ of G/H is equal to n(G) - n(H), with n(G) and 
n(H) the number of generators for the groups G and H, respectively. 

This discussion can be immediately extended: for example, $SO(4)/SO(3) = S^3$. More generally, the sphere $S^d$ 
can be identified as the coset manifold SO(d + 1)/SO(d) with dimension $\frac12(d+1)d - \frac12 d(d-1) = d$. We will 
come across this coset construction again when we discuss de Sitter and anti de Sitter spacetimes in chapters IX.10 
and IX.11. 
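
The generator counting above is easy to tabulate. The following is a minimal sketch, assuming only that SO(n) has n(n - 1)/2 generators (the independent n-by-n antisymmetric matrices); the function names are illustrative and not from the text.

```python
def dim_so(n: int) -> int:
    """Number of generators of SO(n): independent n-by-n antisymmetric matrices."""
    return n * (n - 1) // 2


def dim_sphere_coset(d: int) -> int:
    """Dimension of SO(d+1)/SO(d), counted as n(G) - n(H)."""
    return dim_so(d + 1) - dim_so(d)


# Columns: d, n(SO(d+1)), n(SO(d)), dimension of the coset manifold
for d in range(1, 6):
    print(d, dim_so(d + 1), dim_so(d), dim_sphere_coset(d))
```

For every d the last column reproduces d, the dimension of the sphere $S^d$.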


Appendix 2: Hyperbolic spaces as coset manifolds 


We first encountered the hyperbolic spaces $H^n$ in chapter I.6. Given what you just learned about the spheres 
$S^n$, perhaps it is not a huge surprise that the hyperbolic spaces are also coset manifolds: $H^n = SO(n, 1)/SO(n)$. 
Let us verify this explicitly for n = 2. The group G = SO(2, 1) is the Lorentz group for (2 + 1)-dimensional 
Minkowski spacetime. The role of the north pole is played by $\hat t = (1, 0, 0)$ (the column vector written as a row 
vector for typographical reasons), left invariant by the subgroup H = SO(2) consisting of rotations in the (2 + 1)-dimensional 
Minkowski spacetime. Thus, the coset manifold is generated by acting with boosts on $\hat t$; in other 
words, using the notation of appendix 3 of chapter I.6, we have $(W, X, Y) = (\cosh\psi, \sinh\psi\cos\theta, \sinh\psi\sin\theta)$. 
Then $ds^2 = dX^2 + dY^2 - dW^2 = d\psi^2 + \sinh^2\psi\, d\theta^2$, in agreement with what we had in chapter I.6. 
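
The induced metric can be checked numerically. This is only a sketch: it assumes the embedding $(W, X, Y) = (\cosh\psi, \sinh\psi\cos\theta, \sinh\psi\sin\theta)$ quoted above, takes a small coordinate step, and compares $dX^2 + dY^2 - dW^2$ with $d\psi^2 + \sinh^2\psi\, d\theta^2$. The step sizes and tolerances are arbitrary choices.

```python
import math


def embed(psi, theta):
    """Point (W, X, Y) of H^2 embedded in (2+1)-dimensional Minkowski spacetime."""
    return (math.cosh(psi),
            math.sinh(psi) * math.cos(theta),
            math.sinh(psi) * math.sin(theta))


def induced_ds2(psi, theta, dpsi, dtheta):
    """dX^2 + dY^2 - dW^2 for a small step (dpsi, dtheta)."""
    w0, x0, y0 = embed(psi, theta)
    w1, x1, y1 = embed(psi + dpsi, theta + dtheta)
    return (x1 - x0) ** 2 + (y1 - y0) ** 2 - (w1 - w0) ** 2


def intrinsic_ds2(psi, dpsi, dtheta):
    """dpsi^2 + sinh^2(psi) dtheta^2."""
    return dpsi ** 2 + math.sinh(psi) ** 2 * dtheta ** 2


psi, theta = 0.7, 1.1
dpsi, dtheta = 1e-5, 2e-5
w, x, y = embed(psi, theta)
print(w * w - x * x - y * y)   # the point lies on the hyperboloid W^2 - X^2 - Y^2 = 1
print(induced_ds2(psi, theta, dpsi, dtheta), intrinsic_ds2(psi, dpsi, dtheta))
```

The two printed line elements agree up to corrections of higher order in the step size.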


Appendix 3: From Killing vectors to Lie algebra 


In chapter V.6, we introduced the commutator between two vector fields and also the Lie derivative. Let us apply 
what we learned there to the Killing vectors. Take the Killing vectors for the sphere given in (6) and calculate 
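their commutators. For example, $[\xi_{(3)}, \xi_{(1)}] = \left[\frac{\partial}{\partial\varphi},\ \sin\varphi\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\frac{\partial}{\partial\varphi}\right] = \xi_{(2)}$. You can verify that in fact, 
$[\xi_{(a)}, \xi_{(b)}] = \epsilon_{abc}\xi_{(c)}$ with $\epsilon_{abc}$ the antisymmetric symbol defined by $\epsilon_{123} = 1$. No wonder Lie was upset about 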
Killing. 
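
The commutators can also be checked numerically. The sketch below assumes one standard form of the three Killing vectors on the sphere, $\xi_{(1)} = \sin\varphi\,\partial_\theta + \cot\theta\cos\varphi\,\partial_\varphi$, $\xi_{(2)} = \cos\varphi\,\partial_\theta - \cot\theta\sin\varphi\,\partial_\varphi$, $\xi_{(3)} = \partial_\varphi$. Equation (6) is not reproduced in this excerpt, so this choice is an assumption. The commutator $[X, Y]^i = X^j\partial_j Y^i - Y^j\partial_j X^i$ is evaluated with central differences.

```python
import math

# Each vector field returns its components (theta-component, phi-component).
def xi1(th, ph):
    return (math.sin(ph), math.cos(ph) / math.tan(th))

def xi2(th, ph):
    return (math.cos(ph), -math.sin(ph) / math.tan(th))

def xi3(th, ph):
    return (0.0, 1.0)


def commutator(X, Y, th, ph, h=1e-5):
    """Components of [X, Y] at (th, ph), using central differences for the derivatives."""
    def along(V, W):
        # (V^j d_j) W^i
        v = V(th, ph)
        dth = [(a - b) / (2 * h) for a, b in zip(W(th + h, ph), W(th - h, ph))]
        dph = [(a - b) / (2 * h) for a, b in zip(W(th, ph + h), W(th, ph - h))]
        return [v[0] * dth[i] + v[1] * dph[i] for i in range(2)]
    xy, yx = along(X, Y), along(Y, X)
    return [xy[i] - yx[i] for i in range(2)]


th, ph = 0.9, 0.4
print(commutator(xi3, xi1, th, ph), xi2(th, ph))   # [xi3, xi1] versus xi2
print(commutator(xi1, xi2, th, ph), xi3(th, ph))   # [xi1, xi2] versus xi3
```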

On a coset manifold, the isometry group is evidently just G. Then the discussion above goes through with 
$\epsilon_{abc}$ replaced by the structure constants $f_{abc}$ of the Lie algebra of the group G, not surprisingly. In other words, 
on the coset manifold G/H, the Killing vectors satisfy 


$[\xi_{(a)}, \xi_{(b)}] = f_{abc}\,\xi_{(c)}$ (10) 


The reader who did not quite follow this need not worry; we will only use this fact in passing in chapter X.1. 

An important special case occurs when H is the trivial subgroup consisting of only the identity element. Then 
G/H is just the group manifold of G: $g_1$ and $g_2$ are equivalent only if they are the same element. Each point on 
the group manifold corresponds to a distinct element of G. 


Appendix 4: Constraints on the Riemann curvature tensor 


We are now going to do what we postponed doing in the text, namely analyze the conditions (2) and (3) in detail. 
If you are seeing this for the first time, the analysis might seem a bit involved, bristling with indices. It is okay 
to skip this appendix. 

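For any vector $V_\rho$, you derived in exercise VI.1.4 that 

$V_{\rho;\mu;\nu} - V_{\rho;\nu;\mu} + V_{\nu;\rho;\mu} - V_{\nu;\mu;\rho} + V_{\mu;\nu;\rho} - V_{\mu;\rho;\nu} = 0.$ 

For $V_\rho$ we now take a Killing vector $\xi_\rho$. Since the Killing vector obeys (3), $\xi_{\rho;\mu} = -\xi_{\mu;\rho}$, the 6 terms in the identity 
reduce to 3 terms: 

$\xi_{\rho;\mu;\nu} - \xi_{\rho;\nu;\mu} - \xi_{\nu;\mu;\rho} = 0$ (11) 

Using the defining expression for the curvature tensor, $V_{\rho;\mu;\nu} - V_{\rho;\nu;\mu} = -R^\sigma{}_{\rho\mu\nu}V_\sigma$, we then obtain 

$\xi_{\mu;\rho;\nu} = -R^\sigma{}_{\nu\rho\mu}\,\xi_\sigma$ (12) 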
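
For readers who want the intermediate steps, here is one way to get from the six-term identity to (11) and (12). It is a sketch that assumes the sign convention $V_{\rho;\mu;\nu} - V_{\rho;\nu;\mu} = -R^\sigma{}_{\rho\mu\nu}V_\sigma$ for the curvature tensor and uses only the Killing condition $\xi_{\rho;\mu} = -\xi_{\mu;\rho}$.

```latex
\begin{align*}
0 &= \xi_{\rho;\mu;\nu} - \xi_{\rho;\nu;\mu}
   + \xi_{\nu;\rho;\mu} - \xi_{\nu;\mu;\rho}
   + \xi_{\mu;\nu;\rho} - \xi_{\mu;\rho;\nu} \\
  &= \xi_{\rho;\mu;\nu} - \xi_{\rho;\nu;\mu}
   - \xi_{\rho;\nu;\mu} - \xi_{\nu;\mu;\rho}
   - \xi_{\nu;\mu;\rho} + \xi_{\rho;\mu;\nu}
   && \text{(antisymmetry in the first two indices)} \\
  &= 2\left(\xi_{\rho;\mu;\nu} - \xi_{\rho;\nu;\mu} - \xi_{\nu;\mu;\rho}\right), \\
\xi_{\nu;\mu;\rho} &= \xi_{\rho;\mu;\nu} - \xi_{\rho;\nu;\mu}
   = -R^{\sigma}{}_{\rho\mu\nu}\,\xi_{\sigma}
   && \text{(definition of the curvature tensor)} \\
\xi_{\mu;\rho;\nu} &= -R^{\sigma}{}_{\nu\rho\mu}\,\xi_{\sigma}
   && \text{(relabel } \nu \to \mu,\ \mu \to \rho,\ \rho \to \nu\text{)}
\end{align*}
```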


Suppose we know the curvature tensor. Then, given a Killing vector $\xi_\rho(X)$ and its first derivative $\xi_{\rho;\mu}(X) = 
-\xi_{\mu;\rho}(X)$ at some point X, we know its second derivative, thanks to (12), and, by repeatedly differentiating 
(12), all of its higher derivatives. (We are of course talking about covariant derivatives.) Hence, we can construct 
$\xi_\rho(x)$ as a Taylor series in $(x^\mu - X^\mu)$. The result, $\xi_{(a)\rho}(x)$, for some specific a, evidently depends linearly on 
the $D + \frac12 D(D-1) = \frac12 D(D+1)$ initial values $\xi_{(a)\rho}(X)$ and $\xi_{(a)\rho;\mu}(X)$. It follows that the number of linearly 
independent $\xi_{(a)\rho}(x)$ cannot exceed $\frac12 D(D+1)$. The maximum number of Killing vectors is equal to 3, 6, 10, for 
D = 2, 3, 4, respectively, as is clear intuitively. 


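In exercise VI.1.5, you derived $T_{\mu\rho;\nu;\omega} - T_{\mu\rho;\omega;\nu} = -(R^\sigma{}_{\mu\nu\omega}T_{\sigma\rho} + R^\sigma{}_{\rho\nu\omega}T_{\mu\sigma})$ for any tensor $T_{\mu\rho}$. Now apply 
this to $\xi_{\mu;\rho}$: 

$\xi_{\mu;\rho;\nu;\omega} - \xi_{\mu;\rho;\omega;\nu} = -\left(R^\sigma{}_{\mu\nu\omega}\,\xi_{\sigma;\rho} + R^\sigma{}_{\rho\nu\omega}\,\xi_{\mu;\sigma}\right)$ (13) 

But we can also compute the left hand side of (13) by differentiating (12): $\xi_{\mu;\rho;\nu;\omega} = -(R^\sigma{}_{\nu\rho\mu;\omega}\,\xi_\sigma + R^\sigma{}_{\nu\rho\mu}\,\xi_{\sigma;\omega})$. 
Plug this into (13), and we get a longish equation 

$\left(R^\sigma{}_{\nu\rho\mu;\omega}\,\xi_\sigma + R^\sigma{}_{\nu\rho\mu}\,\xi_{\sigma;\omega}\right) - \left(R^\sigma{}_{\omega\rho\mu;\nu}\,\xi_\sigma + R^\sigma{}_{\omega\rho\mu}\,\xi_{\sigma;\nu}\right) = R^\sigma{}_{\mu\nu\omega}\,\xi_{\sigma;\rho} + R^\sigma{}_{\rho\nu\omega}\,\xi_{\mu;\sigma}$ (14) 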


involving the curvature tensor and its derivatives, and the Killing vector and its derivatives. 
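Let’s step back for a moment and think how best to organize the mess. We will collect all the terms involving 
$\xi_\sigma$ on one side and all those involving $\xi_{\sigma;\kappa}$ on the other: 

$(R^\sigma{}_{\nu\rho\mu;\omega} - R^\sigma{}_{\omega\rho\mu;\nu})\,\xi_\sigma = -R^\sigma{}_{\nu\rho\mu}\,\xi_{\sigma;\omega} + R^\sigma{}_{\omega\rho\mu}\,\xi_{\sigma;\nu} + R^\sigma{}_{\mu\nu\omega}\,\xi_{\sigma;\rho} - R^\sigma{}_{\rho\nu\omega}\,\xi_{\sigma;\mu}$ 
$= \left(R^\sigma{}_{\mu\nu\omega}\delta^\kappa_\rho - R^\sigma{}_{\rho\nu\omega}\delta^\kappa_\mu - R^\sigma{}_{\nu\rho\mu}\delta^\kappa_\omega + R^\sigma{}_{\omega\rho\mu}\delta^\kappa_\nu\right)\xi_{\sigma;\kappa}$ (15) 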


This way of collecting terms is possible, because $\xi$ is not just any garden variety vector, but the very special 
Killing vector. In particular, due to (3), we can convert $\xi_{\mu;\sigma}$ in (13) to $-\xi_{\sigma;\mu}$. Incidentally, equations of this type in 
differential geometry are in fact much less forbidding than they look if we remain cognizant of various symmetry 
properties of the expressions involved; for example, that (13) is antisymmetric in $(\nu \leftrightarrow \omega)$. 

We study (15) in the following appendix. 


Appendix 5: Maximal symmetry fixes the curvature tensor 


We can exploit (15) by using our knowledge of the Killing vectors to place a powerful constraint on the curvature 
tensor. Note that for each Killing vector we know about, we have one constraint on the curvature tensor. 
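At a given point, we can take $\xi$ to be a translation-type Killing vector, for which $\xi_{\sigma;\kappa}$ vanishes at that point. 
Since we have D linearly independent $\xi$s, the coefficient of $\xi_\sigma$ in (15) must vanish for each value of $\sigma$: 

$R^\sigma{}_{\nu\rho\mu;\omega} = R^\sigma{}_{\omega\rho\mu;\nu}$ (16) 

If we take $\xi$ to be a rotation-type Killing vector, then $\xi_\sigma$ vanishes at that point, and thus the left hand side of (15) 
vanishes. Since by definition, $\xi_{\sigma;\kappa}$ regarded as a matrix spans the basis of D-by-D antisymmetric matrices, the 
vanishing of the right hand side of (15) forces the coefficient of $\xi_{\sigma;\kappa}$ to be symmetric under $(\sigma \leftrightarrow \kappa)$: 

$R^\sigma{}_{\mu\nu\omega}\delta^\kappa_\rho - R^\sigma{}_{\rho\nu\omega}\delta^\kappa_\mu - R^\sigma{}_{\nu\rho\mu}\delta^\kappa_\omega + R^\sigma{}_{\omega\rho\mu}\delta^\kappa_\nu = R^\kappa{}_{\mu\nu\omega}\delta^\sigma_\rho - R^\kappa{}_{\rho\nu\omega}\delta^\sigma_\mu - R^\kappa{}_{\nu\rho\mu}\delta^\sigma_\omega + R^\kappa{}_{\omega\rho\mu}\delta^\sigma_\nu$ (17) 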


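We now have all the information we need from (17). Contract $\kappa$ with $\rho$ and obtain 

$D R^\sigma{}_{\mu\nu\omega} - R^\sigma{}_{\mu\nu\omega} - R^\sigma{}_{\nu\omega\mu} + R^\sigma{}_{\omega\nu\mu} = R^\sigma{}_{\mu\nu\omega} - R_{\nu\mu}\delta^\sigma_\omega + R_{\omega\mu}\delta^\sigma_\nu$ 

where we used $R^\rho{}_{\rho\nu\omega} = 0$. Invoking the cyclic identity in exercise VI.1.3, we obtain $(D - 1)R^\sigma{}_{\mu\nu\omega} = R_{\omega\mu}\delta^\sigma_\nu - 
R_{\nu\mu}\delta^\sigma_\omega$ or, after lowering indices, $(D - 1)R_{\sigma\mu\nu\omega} = R_{\omega\mu}g_{\sigma\nu} - R_{\nu\mu}g_{\sigma\omega}$. Contracting with $g^{\mu\omega}$, we find that $R_{\sigma\nu} = 
\frac{1}{D}R\,g_{\sigma\nu}$. Inserting back, we learn that the Riemann curvature tensor is given by $R_{\sigma\mu\nu\omega} = \frac{R}{D(D-1)}(g_{\omega\mu}g_{\sigma\nu} - 
g_{\nu\mu}g_{\sigma\omega})$. We are almost there: we will have (7) if we can show that R is necessarily a constant. Intuitively, that is 
more or less obvious, because a maximally symmetric space is featureless (every point is as good as every 
other point), and R is a scalar independent of coordinate choice. 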

We have one more card up our sleeves, the Bianchi identity $(R^{\sigma\nu} - \frac12 R g^{\sigma\nu})_{;\nu} = 0$. Inserting $R_{\sigma\nu} = \frac{1}{D}R\,g_{\sigma\nu}$, 
we obtain $(\frac{1}{D} - \frac12)\partial_\sigma R = 0$. Thus, for $D \neq 2$, the scalar curvature R is a constant, which we write conventionally 
as R = D(D — 1)K, and we are done. 

The special case D = 2 is easily dispatched, since the curvature tensor then has only one component, $R_{1212}$. 
Indeed, according to exercise VI.1.6, regardless of whether the space is maximally symmetric, the Riemann 
curvature tensor always has the form $R_{\sigma\mu\nu\omega} = \frac12 R(g_{\omega\mu}g_{\sigma\nu} - g_{\nu\mu}g_{\sigma\omega})$. But for a maximally symmetric space, we 
have another equation up our sleeves we haven't used, namely (16). Plugging the form of the curvature tensor 
into (16), we find that R is also constant for a D = 2 maximally symmetric space. 

So, for a maximally symmetric space of any dimension, we have, as we had more or less guessed, the highly 
appealing result $R_{\tau\rho\mu\nu} = K(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})$. 
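
As a quick consistency check of this result, the following sketch builds $R_{\tau\rho\mu\nu} = K(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})$ at a single point and contracts it. It assumes, for simplicity, an orthonormal frame with Euclidean signature in which $g$ is the identity matrix (so raising and lowering indices is trivial); the function names are illustrative.

```python
def max_sym_riemann(D, K):
    """R[t][r][m][n] = K (g_tm g_rn - g_tn g_rm), with g taken to be the identity."""
    g = [[1.0 if i == j else 0.0 for j in range(D)] for i in range(D)]
    return [[[[K * (g[t][m] * g[r][n] - g[t][n] * g[r][m])
               for n in range(D)] for m in range(D)]
             for r in range(D)] for t in range(D)]


def ricci(R):
    """R_{rn} = sum over t of R_{t r t n} (first and third indices contracted)."""
    D = len(R)
    return [[sum(R[t][r][t][n] for t in range(D)) for n in range(D)] for r in range(D)]


def scalar_curvature(R):
    ric = ricci(R)
    return sum(ric[i][i] for i in range(len(ric)))


def riemann_squared(R):
    """The invariant R_{trmn} R^{trmn} (indices raised trivially since g is the identity)."""
    D = len(R)
    rng = range(D)
    return sum(R[t][r][m][n] ** 2 for t in rng for r in rng for m in rng for n in rng)


K = 0.5
for D in (2, 3, 4):
    R = max_sym_riemann(D, K)
    print(D, scalar_curvature(R), D * (D - 1) * K, riemann_squared(R), 2 * D * (D - 1) * K ** 2)
```

The scalar curvature comes out as D(D - 1)K and the square of the curvature tensor as $2D(D-1)K^2$, both fixed entirely by K.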


Appendix 6: Form invariant tensors 


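We can apply a condition analogous to (1) to tensors other than the metric tensor. We say that a tensor $T_{\mu\nu\cdots\omega}$ is 
form invariant if 

$\frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\cdots\frac{\partial x^\tau}{\partial x'^\omega}\,T_{\rho\sigma\cdots\tau}(x) = T_{\mu\nu\cdots\omega}(x')$ (18) 

The infinitesimal form reads 

$T_{\rho\nu\cdots\omega}\,\partial_\mu\xi^\rho + T_{\mu\rho\cdots\omega}\,\partial_\nu\xi^\rho + \cdots + \xi^\rho\partial_\rho T_{\mu\nu\cdots\omega} = 0$ (19) 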


This has the same form as (2): a sum of terms involving the derivative of $\xi^\rho$ is equal to a term involving the 
derivative of $T_{\mu\nu\cdots\omega}$. With this definition, the existence of isometry amounts to requiring that the metric be form 
invariant. 

The concept of a form invariant tensor makes sense regardless of whether the space itself is maximally 
symmetric. If the space is maximally symmetric, a tensor that satisfies (19) for all $\frac12 D(D+1)$ Killing vectors 
is said to be maximally form invariant. This is clearly quite restrictive; see exercise 3. 


Exercises 


1 Solve for the Killing vectors on the sphere. 
2 Evaluate the 3 Killing vectors $\xi$ in (6) at various points on the sphere. 


3 Establish the following facts for a maximally symmetric space. (a) A maximally form invariant scalar is 
necessarily constant. (b) There is no maximally form invariant vector for $D \neq 1$. (c) For $D \neq 2$, a maximally 
form invariant 2-indexed tensor $T_{\mu\nu}$ must be equal to a constant times the metric tensor $g_{\mu\nu}$, as you might 
expect. Intuitively, a maximally symmetric space is completely featureless (think of the familiar sphere). 
So what can the maximally form invariant vector possibly depend on? Answer the same question for the 
maximally form invariant 2-indexed tensor. 


Note 


1. Coset manifolds also enter into the concept of spontaneous symmetry breaking in quantum field theory, and 
the dimension of G/H has to do with the number of Nambu-Goldstone bosons. See QFT Nut, p. 229. 


IX.7 Differential Forms and Vielbein 


Many legs 


I now teach you a fancier, but in fact easier, method of calculating curvature than the 
traditional method given in chapter VI.1. Namely, I will tell you about vielbein (German¹ 
for “many legs”: dreibein = three legs, vierbein* = four legs, and so on) and 
differential forms, which definitely do not belong to the “fancy but useless” category 
in which I file away many things at the more mathematical end of theoretical physics. 
Indeed, both the vielbein and differential forms have their origins in humble physical 
considerations. 

At this point, Professor Flat ambles by again, mumbling, “Even though the world is 
round, locally we can still erect orthonormal frames* of reference, Descartes’s good old x-, 
y-, and z-axes, now called legs.” 

You and I respond, “Yes, professor, we have learned that Riemannian manifolds are 
locally flat. Everyday life is flat.” 

The idea is to write the metric as 


$g_{\mu\nu}(x) = \eta_{\alpha\beta}\,e^\alpha_\mu(x)\,e^\beta_\nu(x)$ (1) 


As Professor Flat explained in chapter I.6 (in our demonstration that we can always choose 
locally flat coordinates at a given point x), in the first step we turn $g_{\mu\nu}(x)$ into $\eta_{\alpha\beta}$ by 
a similarity transformation. In (1), we have merely denoted the matrix that appears in 
the similarity transformation by $e^\alpha_\mu(x)$. 

The metrics we commonly encounter are so simple that we can even write down $e^\alpha_\mu(x)$ by 
inspection. I hasten to give an example: the familiar 2-sphere with $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$ 
(that is, with $g_{\theta\theta} = 1$, $g_{\varphi\varphi} = \sin^2\theta$), from which we read off $e^1_\theta = 1$ and $e^2_\varphi = \sin\theta$ (all other 
components are zero). In (1), the indices $\alpha$ and $\beta$, called Lorentz indices, take on d values 


* Indeed, we have already encountered orthonormal frames when we discussed Fermi normal coordinates in 
chapter IX.3. 


Figure 1 Running around erecting orthonormal frames. 


in d-dimensional spacetime. The usual indices $\mu$, $\nu$ are called world indices to distinguish 
them from the Lorentz indices. Even though the two kinds of indices are conceptually quite 
different (see below), we can still think of the square array e(x) numerically as a square 
matrix. Then we can write (1) as a matrix equation $g = e^T\eta e$, and, as was just mentioned, 
think of e as a similarity transformation that diagonalizes $g_{\mu\nu}$ and scales it to the unit 
matrix. 
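
The matrix form $g = e^T\eta e$ is easy to try out on the 2-sphere example. The sketch below treats the vielbein as a plain matrix, with rows labeled by the Lorentz index and columns by the world index (an arbitrary but convenient layout), and uses the Euclidean metric for $\eta$ since the sphere is locally Euclidean.

```python
import math


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def sphere_vielbein(theta):
    """e[alpha][mu]: rows are Lorentz indices (1, 2), columns are world indices (theta, phi)."""
    return [[1.0, 0.0],
            [0.0, math.sin(theta)]]


def metric_from_vielbein(e, eta):
    """g = e^T eta e, the matrix form of equation (1)."""
    return matmul(transpose(e), matmul(eta, e))


theta = 0.7
eta = [[1.0, 0.0], [0.0, 1.0]]          # Euclidean metric in the local frame
g = metric_from_vielbein(sphere_vielbein(theta), eta)
print(g)                                 # diagonal, with entries 1 and sin^2(theta)
print(math.sin(theta) ** 2)
```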

Since we are physicists, most of the manifolds we deal with will be locally Minkowskian, 
and hence the Minkowski metric $\eta_{\alpha\beta}$ appears here. If the manifold is locally Euclidean, I 
should have written $g_{ij} = \delta_{ab}\,e^a_i e^b_j$, but I am not that fussy. For both cases, I will use Greek 
rather than Latin letters for the indices, and trust you, when appropriate, to replace the 
Minkowski metric by the Euclidean metric in your head. In either case, we picture a little 
man running around erecting orthonormal frames (dreibein, as shown in figure 1). If the 
metric $g_{\mu\nu}$ describes the universe, all we are doing is setting up local Lorentz frames at 
each point in spacetime. 


Lorentz indices versus world indices 


Lorentz indices $\alpha, \beta, \cdots$ are contracted with the Minkowski metric $\eta_{\alpha\beta}$ (or the Euclidean 
metric, as the case may be), which, consisting of 1s and 0s, is much easier to deal with 
than $g_{\mu\nu}$. (That is the point of the formalism!) In contrast, world indices $\mu, \nu, \cdots$ are 
contracted with the metric $g_{\mu\nu}$, as usual. As $g_{\mu\nu}(x)$ varies from point to point, so does 
the vielbein $e^\alpha_\mu(x)$, as indicated in the figure. (Note that we use the Greek letters early 
in the alphabet for Lorentz indices and the Greek letters later in the alphabet for world 
indices.) 

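The transformation of the metric 

$g'_{\lambda\sigma}(x') = g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\lambda}\frac{\partial x^\nu}{\partial x'^\sigma}$ (2) 

under a coordinate transformation translates into 

$e'^\alpha_\lambda(x') = e^\alpha_\mu(x)\,\frac{\partial x^\mu}{\partial x'^\lambda}$ (3) 

Verify this by calculating $g'_{\lambda\sigma}(x') = \eta_{\alpha\beta}\,e'^\alpha_\lambda(x')\,e'^\beta_\sigma(x')$. 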
In a sense, the vielbein represents the square root* of the metric. Taking the determinant 
of (1), we have $-g = e^2$, where e denotes the determinant of $e^\alpha_\mu$ regarded as a matrix, and 
thus the pesky square root in the volume factor $\sqrt{-g} = e$ goes away. 

In taking an ordinary square root, we are free to take either the plus or minus sign. 
Similarly, we are free to Lorentz transform (rotate, in the Euclidean case) the vielbein: if 
you use the vielbein $e^\alpha_\mu$, I am free to use some other vielbein $\tilde e^\alpha_\mu$ instead, as long as mine 
is related to yours by a Lorentz transformation 


$\tilde e^\alpha_\mu(x) = \Lambda^\alpha{}_\beta(x)\,e^\beta_\mu(x)$ (4) 


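Let us check that indeed $\tilde g_{\mu\nu}(x) = \tilde e^\alpha_\mu(x)\,\eta_{\alpha\beta}\,\tilde e^\beta_\nu(x) = e^\alpha_\mu(x)\,\eta_{\alpha\beta}\,e^\beta_\nu(x) = g_{\mu\nu}(x)$ if $\Lambda^T\eta\Lambda = \eta$. 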
Note that the transformation in (4), in contrast to that in (3), leaves x untouched. What 
we are discussing here is not a coordinate transformation, but the freedom to orient our 
vielbein whichever way we like. To emphasize this, I have used twiddle instead of prime. 
(Also, note that $\Lambda$ here can depend on x, unlike the discussion in special relativity; in field 
theory, this is known as a local or gauge transformation.) 

Again thinking of e(x) as a square matrix, we can consider its inverse, which we write 
as $e^\mu_\alpha(x)$. The standard result from linear algebra states that the left and right inverses of a 
nonsingular matrix are the same, and hence we have $e^\mu_\alpha(x)e^\beta_\mu(x) = \delta^\beta_\alpha$ and $e^\mu_\alpha(x)e^\alpha_\nu(x) = \delta^\mu_\nu$. 
For diagonal $g_{\mu\nu}$, $e^\alpha_\mu(x)$ is also diagonal, and the inverse vielbein $e^\mu_\alpha(x)$ may be written down 
by inspection. For example, for the 2-sphere, $e^\theta_1 = 1$ and $e^\varphi_2 = \frac{1}{\sin\theta}$ (all other components 
are zero). 

The transformation properties (3) and (4) of the vielbein show that it straddles the 
domain of the Lorentz indices and the domain of the world indices. The vielbein can be 
used to convert one type of index to the other type. For example, given a world vector 
$J^\mu(x)$, we can construct $J^\alpha(x) = e^\alpha_\mu(x)J^\mu(x)$, which is in fact a Lorentz vector, as you 
are invited to verify. Under a local Lorentz transformation, $J^\alpha(x)$ transforms as a Lorentz 
vector, but under a coordinate transformation, it transforms as a world scalar. Similarly, 
given the Riemann curvature tensor $R^\lambda{}_{\sigma\mu\nu}$, we can form $R^\alpha{}_{\beta\gamma\delta} = R^\lambda{}_{\sigma\mu\nu}\,e^\alpha_\lambda e^\sigma_\beta e^\mu_\gamma e^\nu_\delta$. We can 
of course also consider mixed objects, such as $R^\alpha{}_{\beta\mu\nu} = R^\alpha{}_{\beta\gamma\delta}\,e^\gamma_\mu e^\delta_\nu$. Again, keep in mind the 
distinction between early indices, such as $\gamma$ and $\delta$, and late indices, such as $\mu$ and $\nu$. 


Differential forms 


I now introduce the language of differential forms. Fear not, we will need only a few 
elementary concepts. Let $x^\mu$ be d real variables (thus, the index $\mu$ takes on d values) 
and $A_\mu(x)$ be d functions of the xs. (In this purely mathematical section, we do not 
for the moment specify what $A_\mu$ is.) We could discuss everything in the abstract like 
mathematicians, but in fact, in our applications, $x^\mu$ represent coordinates, and as we will 
see, forms have natural geometric interpretations. 


* Readers of my field theory book will recognize that this is one of three ways of the warrior theorist. See “New 
Closing Words” in QFT Nut, p. 522. 
We call the object $A = A_\mu dx^\mu$ a 1-form. The differentials $dx^\mu$ are treated following 
Newton and Leibniz. If we change coordinates $x \to x'$, then as usual, $dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}dx^\nu$. As 
we said in chapter I.3, $dx^\mu$ is the ur-vector. (If you confuse the d here with the d in the 
preceding paragraph, you are in trouble!) 


The form A does not carry any indices and so does not transform. Indeed, we insert the 
transformation of $dx^\mu$ just mentioned to obtain $A = A_\mu dx^\mu = A_\mu\frac{\partial x^\mu}{\partial x'^\nu}dx'^\nu \equiv A'_\nu dx'^\nu$. The 
last step defines $A'$ and thus reproduces the standard transformation law of vectors with a 
lower index under coordinate transformation: $A'_\nu = A_\mu\frac{\partial x^\mu}{\partial x'^\nu}$. Hence $A_\mu(x)$ is a vector field. 

The important point here is that the form A does not depend on any specific coordinate 
system: it is coordinate free. But you are welcome to express it as a linear combination of 
$dx^\mu$ in a coordinate system of your choice. 

It was already explained in part I that $A_\mu$ transforms like the dual ur-vector $\partial_\mu$. As an 
example, consider the 1-form $A = \cos\theta\,d\varphi$. Regarding $\theta$ and $\varphi$ as angular coordinates on 
the 2-sphere, we have $A_\theta = 0$ and $A_\varphi = \cos\theta$. Note that the natural union of $A_\mu$ and $dx^\mu$, 
a marriage made in heaven so to speak, was already foreordained in chapter I.5. 

Similarly, we define a p-form as $H = \frac{1}{p!}H_{\mu_1\mu_2\cdots\mu_p}dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}$. (Repeated indices 
are summed, as always.) The degenerate example is that of a 0-form, call it $\Lambda$, which is 
just a scalar function of the coordinates $x^\mu$. An example of a 2-form is $F = \frac{1}{2!}F_{\mu\nu}dx^\mu dx^\nu$. 


We can add two p-forms together in the obvious way: 

$H + K = \frac{1}{p!}\left(H_{\mu_1\mu_2\cdots\mu_p} + K_{\mu_1\mu_2\cdots\mu_p}\right)dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}$ 

In contrast, we cannot add a p-form to a q-form unless p = q. But we can naturally multiply 
any two forms together: for example, 

$AF = (A_\mu dx^\mu)\left(\frac12 F_{\nu\lambda}dx^\nu dx^\lambda\right) = \frac12 A_\mu F_{\nu\lambda}\,dx^\mu dx^\nu dx^\lambda$ 

The product of a p-form and a q-form is evidently a (p + q)-form. In the example just 
given, the product of a 1-form and a 2-form is a 3-form. 


The reason for anticommuting 


We now face the question of how to think about the products of differentials $dx^\mu dx^\nu$. In an 
elementary course on calculus, we learned that dxdy represents the area of an infinitesimal 
rectangle with length dx and width dy. At that level, we more or less automatically regard 
dydx as the same as dxdy. The order of writing the differentials does not matter. 

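Think, however, about making a coordinate transformation $(x, y) \to (x', y')$, with $x' = 
x'(x, y)$ and $y' = y'(x, y)$ two functions of the coordinates x and y. Now look at 

$dx'dy' = \left(\frac{\partial x'}{\partial x}dx + \frac{\partial x'}{\partial y}dy\right)\left(\frac{\partial y'}{\partial x}dx + \frac{\partial y'}{\partial y}dy\right)$ (5) 

Notice that the coefficient of dxdy is $\frac{\partial x'}{\partial x}\frac{\partial y'}{\partial y}$ and that the coefficient of dydx is $\frac{\partial x'}{\partial y}\frac{\partial y'}{\partial x}$. We see 
that things work out neatly if we treat the product between differentials as anticommuting, 
so that dydx = -dxdy. Then the dxdy and dydx terms in (5) combine into 

$\left(\frac{\partial x'}{\partial x}\frac{\partial y'}{\partial y} - \frac{\partial x'}{\partial y}\frac{\partial y'}{\partial x}\right)dxdy$ 

and we recognize the expression in parentheses as the determinant of the matrix 

$\begin{pmatrix} \frac{\partial x'}{\partial x} & \frac{\partial x'}{\partial y} \\ \frac{\partial y'}{\partial x} & \frac{\partial y'}{\partial y} \end{pmatrix}$ 

namely, the Jacobian $J(x', y'; x, y)$ for the transformation $(x, y) \to (x', y')$. Furthermore, 
if the product of differentials anticommutes, we would have dxdx = -dxdx = 0, and 
similarly dydy = 0. Consequently, (5) simplifies nicely to 

$dx'dy' = \left(\frac{\partial x'}{\partial x}\frac{\partial y'}{\partial y} - \frac{\partial x'}{\partial y}\frac{\partial y'}{\partial x}\right)dxdy = J(x', y'; x, y)\,dxdy$ (6) 

We obtain the correct Jacobian for transforming the area element dxdy to the area element 
dx'dy'. 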
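
The same bookkeeping can be done mechanically. In this minimal sketch a 1-form in two dimensions is stored as its pair of coefficients (of dx and dy), and the product uses only the rules dxdx = dydy = 0 and dydx = -dxdy. The map $x' = x^2 - y^2$, $y' = 2xy$ is a hypothetical example chosen for illustration, not one taken from the text.

```python
def wedge(a, b):
    """Coefficient of dx dy in (a[0] dx + a[1] dy)(b[0] dx + b[1] dy).

    The dx dx and dy dy terms vanish, and dy dx = -dx dy supplies the minus sign.
    """
    return a[0] * b[1] - a[1] * b[0]


def d_xprime(x, y):
    """dx' for the sample map x' = x^2 - y^2: coefficients of (dx, dy)."""
    return (2 * x, -2 * y)


def d_yprime(x, y):
    """dy' for the sample map y' = 2 x y: coefficients of (dx, dy)."""
    return (2 * y, 2 * x)


def jacobian(x, y):
    """Determinant of the matrix of partial derivatives, computed directly."""
    return (2 * x) * (2 * x) - (-2 * y) * (2 * y)


x, y = 1.5, -0.5
print(wedge(d_xprime(x, y), d_yprime(x, y)), jacobian(x, y))
```

Swapping the two factors flips the sign, which is the directional character of the area element discussed in the text.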

The product between differentials $dx^\mu dx^\nu$ is known as the wedge product and is written 
as $dx^\mu \wedge dx^\nu$ in many texts. We will omit the wedge; there is no need to clutter up the page, 
at least for our purposes. Alternatively, we can regard the differentials $dx^\mu$ formally as 
anticommuting objects, so that by definition, $dx^\mu dx^\nu = -dx^\nu dx^\mu$. 

The little exercise given above motivates the natural emergence of anticommutation in 
this context. Otherwise, it would appear to be totally arbitrary. Our little exercise also makes 
clear the geometrical origin of the “extra” sign. The area element dxdy is directional: dxdy 
and dydx span the same area but point in opposite directions. You can see that this makes 
sense by recalling, for example, your first encounter with the notion of the divergence of 
some vector field, for example, the divergence of a current $\vec J(\vec x)$. You were taught to think 
of an infinitesimal cube and multiply the current by the area element on each of the six 
faces of the cube. To obtain $\vec\nabla\cdot\vec J$, you clearly have to treat the area elements on opposite 
faces as pointing in opposite directions. 

The anticommuting property $dx^\mu dx^\nu = -dx^\nu dx^\mu$ indicates that in d dimensions, we 
can have p-forms only for $p \le d$. 


The exterior derivative 
We now define a differential operation d (known as the exterior derivative) to act on any 
form. Acting on a p-form H, it gives by definition a (p + 1)-form 


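$dH = \frac{1}{p!}\,\partial_\nu H_{\mu_1\mu_2\cdots\mu_p}\,dx^\nu dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}$ 

Thus, $d\Lambda = \partial_\mu\Lambda\,dx^\mu$ and $dA = \partial_\mu A_\nu\,dx^\mu dx^\nu = \frac12(\partial_\mu A_\nu - \partial_\nu A_\mu)\,dx^\mu dx^\nu$. 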
We see that this mathematical formalism is almost tailor made to describe* 
electromagnetism. If we call $A = A_\mu dx^\mu$ the potential 1-form and think of $A_\mu$ as the 


* For a simple formulation of Yang-Mills theory using differential forms, see QFT Nut. 
electromagnetic potential, then F = dA is in fact the field 2-form. If we were to write F 
out in terms of its components $F = \frac12 F_{\mu\nu}dx^\mu dx^\nu$, then $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is indeed the 
electromagnetic field. We see that the exterior derivative is just the sophisticate’s name for 
what the common people would call the gradient or the curl. 

Note that $x^\mu$ is not a form, and $dx^\mu$ is not d acting on a form. 

If you like, you can think of differential forms as “merely” an elegantly compact notation. 
The point is to think of physical objects like A and F as entities, without having to commit 
to any particular coordinate system. This is particularly convenient when you have to deal 
with objects more complicated than A and F, for example in string theory (see appendix 2). 
By using differential forms, we avoid drowning in a sea of indices. 

Consider the product of two 1-forms A and B. Now act with d on the 2-form 
$AB = A_\mu B_\nu\,dx^\mu dx^\nu$. We have 

$d(AB) = \partial_\lambda(A_\mu B_\nu)\,dx^\lambda dx^\mu dx^\nu = \left((\partial_\lambda A_\mu)B_\nu + A_\mu(\partial_\lambda B_\nu)\right)dx^\lambda dx^\mu dx^\nu$ 

$= (\partial_\lambda A_\mu)\,dx^\lambda dx^\mu\,B_\nu dx^\nu - A_\mu dx^\mu\,(\partial_\lambda B_\nu)\,dx^\lambda dx^\nu = (dA)B - A(dB)$ 

The first equality comes from the definition of d; the second from the product rule in 
ordinary calculus; the third from writing out the previous expression and moving $dx^\lambda$ past 
$dx^\mu$ in the second term; and finally, the fourth from grouping everything back into the 
forms. The important point here is the minus sign that appeared due to our moving $dx^\lambda$ 
past $dx^\mu$. 
This result generalizes readily. Let A be a p-form and B a q-form. Then we have 


$d(AB) = (dA)B + (-1)^p A(dB)$ (7) 


Evidently, the sign $(-1)^p$ appears because we moved $dx^\lambda$ past $dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}$. (Note 
that q does not appear explicitly in (7).) 
An important identity is 


dd=0 (8) 


This says that acting with d on any form twice gives zero. This fundamental identity is easy 
to prove by direct evaluation: 

$ddH = \frac{1}{p!}\,\partial_\lambda\partial_\nu H_{\mu_1\mu_2\cdots\mu_p}\,dx^\lambda dx^\nu dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p} = 0,$ 

since $dx^\lambda dx^\nu = -dx^\nu dx^\lambda$, while Newton and Leibniz told us that $\partial_\lambda\partial_\nu = \partial_\nu\partial_\lambda$. In particular, 
dF = ddA = 0. Writing this out in components, you will recognize this as a standard 
identity (the Bianchi identity) in electromagnetism. 
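
The component form of dF = 0 is $\partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} = 0$, and it can be checked numerically for any smooth potential. In the sketch below the potential `A` is a made-up example (an assumption, not taken from the text), and derivatives are taken by central differences. Because the difference operators in different directions commute, the cyclic sum cancels up to rounding error.

```python
import math


def A(x):
    """A hypothetical smooth potential A_mu(x) in four dimensions (illustration only)."""
    t, x1, y, z = x
    return [math.sin(x1 * y) + t * z, t * t * y, math.cos(z) * x1, x1 * y * t]


def partial(f, mu, x, h=1e-3):
    """Central-difference approximation to d f / d x^mu at the point x."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (f(xp) - f(xm)) / (2 * h)


def F(mu, nu, x):
    """Field strength F_{mu nu} = d_mu A_nu - d_nu A_mu."""
    return partial(lambda p: A(p)[nu], mu, x) - partial(lambda p: A(p)[mu], nu, x)


def bianchi(lam, mu, nu, x):
    """Cyclic sum d_lam F_{mu nu} + d_mu F_{nu lam} + d_nu F_{lam mu}."""
    return (partial(lambda p: F(mu, nu, p), lam, x)
            + partial(lambda p: F(nu, lam, p), mu, x)
            + partial(lambda p: F(lam, mu, p), nu, x))


point = [0.3, -0.2, 0.5, 0.7]
print(F(0, 1, point))                    # a generic component of F, not zero
print(max(abs(bianchi(l, m, n, point))
          for l in range(4) for m in range(4) for n in range(4)))
```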

Note also that the square of a p-form A satisfies $A^2 = (-1)^{p^2}A^2$. (Write $A^2 = AA = 
\left(\frac{1}{p!}\right)^2 A_{\mu_1\cdots\mu_p}dx^{\mu_1}\cdots dx^{\mu_p}\,A_{\nu_1\cdots\nu_p}dx^{\nu_1}\cdots dx^{\nu_p}$ and mentally move the $dx^\mu$s 
past all the $dx^\nu$s. For each of them, we get a factor of $(-1)^p$. Thus, the stated result follows.) 
Therefore, for p odd, $A^2 = 0$. 


Relating connection 1-forms 


After this excursion into differential forms, I can at long last tell you Elie Cartan’s formulation 
of the differential geometry of Riemannian manifolds. The transformation law (3) 
immediately suggests packaging the vielbein into a 1-form $e^\alpha = e^\alpha_\mu dx^\mu$ called, naturally, 
the vielbein 1-form. In our simple example, $e^1 = d\theta$ and $e^2 = \sin\theta\,d\varphi$. Carrying no world 
index, $e^\alpha$ should transform like a scalar under coordinate transformation. We can readily 
check this: $e'^\alpha(x') = e'^\alpha_\lambda(x')\,dx'^\lambda = e^\alpha_\mu(x)\frac{\partial x^\mu}{\partial x'^\lambda}dx'^\lambda = e^\alpha_\mu(x)\,dx^\mu = e^\alpha(x)$. 

We studied infinitesimal rotations in chapter I.3 and found that the generators are 
antisymmetric matrices. Similarly, we studied infinitesimal Lorentz transformations in 
chapter III.3 and found that the generators are also antisymmetric, except that we have 
to watch out for signs when raising and lowering indices. Recall that we considered (with 
some suitable changes in notation) an infinitesimal transformation $\Lambda^\alpha{}_\beta \simeq \delta^\alpha_\beta + \omega^\alpha{}_\beta$. To 
leading order in $\omega$, the condition that $\Lambda$ is in fact a Lorentz transformation reduces to (see 
(III.3.13)) $\omega^\gamma{}_\alpha\eta_{\gamma\beta} + \eta_{\alpha\gamma}\omega^\gamma{}_\beta = 0$, that is, $\omega_{\beta\alpha} + \omega_{\alpha\beta} = 0$. 

On a curved manifold, as we move from point x to a nearby point x + dx, we expect 
that the local frame will rotate or Lorentz transform, depending on whether the manifold 
is locally Euclidean or Minkowskian. In other words, an infinitesimal translation has the 
effect of rotating the form $e^\alpha(x)$ infinitesimally. Thus, the result of differentiating (that is, 
applying the exterior derivative d to) $e^\alpha(x)$ should be given by 


$de^\alpha = -\omega^\alpha{}_\beta\,e^\beta$ (9) 


for some antisymmetric $\omega_{\alpha\beta}$. (Here the minus sign is conventional and is part of the 
definition of $\omega$.) Since e is a 1-form, it follows that de is a 2-form and hence 


$\omega^\alpha{}_\beta = \omega^\alpha{}_{\beta\mu}\,dx^\mu$ (10) 


is also a 1-form, known as the connection 1-form: it connects nearby frames. Given e, (9) 
enables us to determine $\omega$. For the 2-sphere, $de^1 = 0$ and $de^2 = \cos\theta\,d\theta d\varphi$, and so the 
connection has only one nonvanishing component: $\omega^{12} = -\omega^{21} = -\cos\theta\,d\varphi$. 
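
Spelled out, the check that this connection satisfies (9) for the 2-sphere runs as follows. This is a sketch using only $e^1 = d\theta$, $e^2 = \sin\theta\,d\varphi$, the anticommutation of differentials, and the fact that in the Euclidean case upper and lower Lorentz indices need not be distinguished.

```latex
\begin{align*}
de^1 &= d(d\theta) = 0, \qquad
de^2 = d(\sin\theta\, d\varphi) = \cos\theta\, d\theta\, d\varphi, \\
-\omega^{12} e^2 &= (\cos\theta\, d\varphi)(\sin\theta\, d\varphi) = 0 = de^1
   && (d\varphi\, d\varphi = 0), \\
-\omega^{21} e^1 &= -(\cos\theta\, d\varphi)\, d\theta = \cos\theta\, d\theta\, d\varphi = de^2
   && (d\varphi\, d\theta = -d\theta\, d\varphi).
\end{align*}
```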

But you and I are free to choose whatever vielbein we like, as was already mentioned 
in connection with (4). Suppose that at a given point, my vielbein $e^\alpha$ is related to yours by 
$e^\alpha = \Lambda^\alpha{}_\beta\,\tilde e^\beta$. (It is worth emphasizing that this is merely a local Lorentz transformation, 
or rotation if we are in space rather than spacetime, of our orthonormal frame, not a 
coordinate transformation. Indeed, you can see from (1) that the metric is not changed.) 
Our connection 1-forms $\omega$ and $\tilde\omega$ had better be related in such a way that (9) holds for both 
of us. 

Suppressing indices, we plug $e = \Lambda\tilde e$ into (9), $de + \omega e = 0$, and plow ahead. Try doing 
it yourself. The first term becomes $d(\Lambda\tilde e) = \Lambda\,d\tilde e + (d\Lambda)\tilde e = -\Lambda\tilde\omega\tilde e + (d\Lambda)\Lambda^{-1}e$. Thus, 
requiring $d\tilde e + \tilde\omega\tilde e = 0$, we relate the two connection 1-forms by 


$\omega = \Lambda\tilde\omega\Lambda^{-1} - (d\Lambda)\Lambda^{-1}$ (11) 

Notice that $\omega$ does not transform as a Lorentz tensor due to the second term in (11). You 
should be reminded of the similar behavior of the Christoffel symbol under a coordinate 
transformation, as discussed in chapter V.6. 
Cartan’s formulation of Riemannian manifolds 


The local curvature of the manifold is a measure of how the connection varies from point 
to point. Thus, we expect curvature to be given by something like $R^\alpha{}_\beta \sim d\omega^\alpha{}_\beta$, a 2-form. But 
we would like the curvature to transform nicely, as a Lorentz tensor (since $R^\alpha{}_\beta$ carries two 
Lorentz indices) under the local transformation $\Lambda$. But you can also see, just by looking at 
(11), that $d\omega^\alpha{}_\beta$ is not going to transform nicely (indeed, see (12) below). 

Just about the only possibility is to add another 2-form to $d\omega$. We are severely limited in 
our choices, since we have available only the 1-forms e and $\omega$. Looking at how the different 
possibilities transform, we easily arrive at the correct choice, namely $\omega$ multiplied by itself, 
that is, $\omega^2$. 

Confusio exclaims, “Wait! I thought that we showed that the square of a 1-form 
vanishes.” 

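Ah, the point is that $\omega$ is a matrix 1-form: $\omega^\alpha{}_\beta = \omega^\alpha{}_{\beta\mu}\,dx^\mu$. Thus, 

$(\omega^2)^\alpha{}_\beta = \omega^\alpha{}_\gamma\,\omega^\gamma{}_\beta = \omega^\alpha{}_{\gamma\mu}\,\omega^\gamma{}_{\beta\nu}\,dx^\mu dx^\nu = \frac12\left(\omega^\alpha{}_{\gamma\mu}\,\omega^\gamma{}_{\beta\nu} - (\mu \leftrightarrow \nu)\right)dx^\mu dx^\nu$ 

which has no particular reason to vanish. Another way of saying this is to think of $\omega^\alpha{}_{\beta\mu}$ 
as d matrices $\omega_\mu$ with matrix indices $\alpha$ and $\beta$. Then, suppressing the $\alpha$ and $\beta$ indices, we can 
write what we just wrote as $\omega^2 = \frac12[\omega_\mu, \omega_\nu]\,dx^\mu dx^\nu$. There is no reason for the matrix 
commutator to vanish. Note that this is consistent with what Confusio said. Were $\omega_\mu$ not 
a matrix, $\omega^2$ would indeed vanish. 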

The upshot of all this is that the desired curvature 2-form can only be the sum of $d\omega$ 
and $\omega^2$ with some relative coefficient, which turns out to be 1 (see below). Thus, we obtain 
the curvature 2-form $R = d\omega + \omega^2$. Restoring indices, we have $R^\alpha{}_\beta = d\omega^\alpha{}_\beta + \omega^\alpha{}_\gamma\,\omega^\gamma{}_\beta$. 

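We now check that $R = \Lambda\tilde R\Lambda^{-1}$ does transform nicely. This is one of the most famous 
calculations in physics history; I will do it, but you should try to do it before reading on. 

Well, just plug (11) into $R = d\omega + \omega^2$ and plow ahead. First note that $0 = d(\Lambda\Lambda^{-1}) = 
(d\Lambda)\Lambda^{-1} + \Lambda\,d\Lambda^{-1}$, and so $d\Lambda^{-1} = -\Lambda^{-1}(d\Lambda)\Lambda^{-1}$. (You might recall that we have used 
this identity on more than one occasion, in (V.6.7), for example.) We have, using (11), 

$d\omega = d\left(\Lambda\tilde\omega\Lambda^{-1} - (d\Lambda)\Lambda^{-1}\right)$ 
$= (d\Lambda)\tilde\omega\Lambda^{-1} + \Lambda(d\tilde\omega)\Lambda^{-1} - (-)\Lambda\tilde\omega\Lambda^{-1}(d\Lambda)\Lambda^{-1} - (dd\Lambda)\Lambda^{-1} + (-)(d\Lambda)\Lambda^{-1}(d\Lambda)\Lambda^{-1}$ (12) 

and 

$\omega^2 = \left(\Lambda\tilde\omega\Lambda^{-1} - (d\Lambda)\Lambda^{-1}\right)^2$ 
$= \Lambda\tilde\omega^2\Lambda^{-1} - \Lambda\tilde\omega\Lambda^{-1}(d\Lambda)\Lambda^{-1} - (d\Lambda)\tilde\omega\Lambda^{-1} + (d\Lambda)\Lambda^{-1}(d\Lambda)\Lambda^{-1}$ (13) 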


The one tricky part of the calculation is the sign of the third and the fifth terms on the 
right hand side of (12). There are two extra minus signs, because we have to sail d past 
the 1-form $\tilde\omega$ to act on $\Lambda^{-1}$ in the third term, and to sail d past the 1-form $d\Lambda$ to act on 
$\Lambda^{-1}$ in the fifth term, as explained in (7). Did you miss these two minus signs? Good for 
you if you didn’t. Also, note that the fourth term in (12) vanishes due to (8). Now add (12) 
and (13). After much cancellation, we find $R = d\omega + \omega^2 = \Lambda\tilde R\Lambda^{-1}$. Indeed, the 2-form $R^\alpha{}_\beta$ 
transforms like a Lorentz tensor. 

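Written out in components, $R^\alpha{}_\beta = \frac{1}{2} R^\alpha{}_{\beta\mu\nu}\, dx^\mu dx^\nu$. (Don’t forget the factor of $\frac{1}{2}$!) As
explained earlier, we can trade Lorentz indices for world indices and vice versa. I leave it to
you to verify that $R^\alpha{}_{\beta\mu\nu}\, e^\lambda_\alpha\, e^\beta_\sigma$ is our beloved Riemann curvature tensor $R^\lambda{}_{\sigma\mu\nu}$. In particular,
$R^{\alpha\beta}{}_{\mu\nu}\, e^\mu_\alpha\, e^\nu_\beta$ is the scalar curvature.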

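For the sphere, we have

$$\begin{aligned}
R^{12} &= d\omega^{12} + \omega^1{}_\gamma\, \omega^{\gamma 2} = d\omega^{12} = \sin\theta\, d\theta\, d\varphi \\
&= e^1 e^2 = \frac{1}{2}\left(R^{12}{}_{12}\, e^1 e^2 + R^{12}{}_{21}\, e^2 e^1\right) = R^{12}{}_{12}\, e^1 e^2
\end{aligned} \tag{14}$$

(The second equality holds because $\omega^1{}_\gamma\, \omega^{\gamma 2} = \omega^{11}\omega^{12} + \omega^{12}\omega^{22}$ vanishes due to the
antisymmetry of $\omega^{\alpha\beta}$. The fifth equality comes from expanding the 2-form $R^{12}$.) Comparing
the final expression with $e^1 e^2$ gives $R^{12}{}_{12} = 1$ and thus the scalar curvature $R = R^{\alpha\beta}{}_{\alpha\beta} =
R^{12}{}_{12} + R^{21}{}_{21} = 2$. Alternatively, we can trade Lorentz indices for world indices and write

$$R^{12} = \sin\theta\, d\theta\, d\varphi = \frac{1}{2}\left(R^{12}{}_{\theta\varphi}\, d\theta\, d\varphi + R^{12}{}_{\varphi\theta}\, d\varphi\, d\theta\right) = R^{12}{}_{\theta\varphi}\, d\theta\, d\varphi$$

and so $R^{12}{}_{\theta\varphi} = \sin\theta$. This is of course consistent with $R^{12}{}_{12} = R^{12}{}_{\theta\varphi}\, e^\theta_1\, e^\varphi_2 = 1$.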
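For readers who want the connection used in (14) spelled out, here is a short worked derivation. It is only a sketch: the frame $e^1 = d\theta$, $e^2 = \sin\theta\, d\varphi$ is an assumption, chosen because it reproduces $e^1 e^2 = \sin\theta\, d\theta\, d\varphi$ as used in (14).

```latex
\begin{align*}
% assumed frame for the unit sphere
e^1 &= d\theta, & e^2 &= \sin\theta\, d\varphi \\
de^1 &= 0, & de^2 &= \cos\theta\, d\theta\, d\varphi \\
% Cartan's first equation de^\alpha = -\omega^\alpha{}_\beta e^\beta with \omega^{21} = -\omega^{12}
de^2 &= -\omega^{21} e^1 = \omega^{12}\, d\theta
  & &\Longrightarrow\quad \omega^{12} = -\cos\theta\, d\varphi \\
% check: -\cos\theta\, d\varphi\, d\theta = +\cos\theta\, d\theta\, d\varphi, and de^1 = -\omega^{12} e^2 \propto d\varphi\, d\varphi = 0
d\omega^{12} &= \sin\theta\, d\theta\, d\varphi = e^1 e^2
\end{align*}
```

The last line is the third equality in (14).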
Thus, in Cartan’s formalism, Riemannian geometry can be elegantly summarized by
the two statements* (suppressing Lorentz indices)

$$de + \omega e = 0 \tag{15}$$

and

$$R = d\omega + \omega^2 \tag{16}$$


Putting the Cartan formalism to work 


In the following chapter, I will show you how to use the Cartan formalism to calculate 
curvature. In most cases, there is considerable reduction of labor. Also, I will postpone 
discussing the so-called Hodge star operation on differential forms until chapter X.5 for 
reasons that will become clear. 

Here I give an example of how easily we can derive some of the identities we already
know. Apply $d$ to (15) and obtain $dde + (d\omega)e - \omega\, de = (d\omega)e - \omega\, de = 0$, remembering
that $dd = 0$ and that moving $d$ past a $p$-form produces (as shown in (7)) a sign $(-1)^p$.
Use (15) to eliminate $de$, so that $(d\omega)e + \omega\omega e = 0 = Re$, where we recall Cartan’s second
equation (16) in the last step. Putting back the Lorentz indices, we learn that

$$R^\alpha{}_\beta\, e^\beta = 0 \tag{17}$$


or, more explicitly, $\frac{1}{2} R^\alpha{}_{\beta\mu\nu}\, e^\beta_\lambda\, dx^\mu dx^\nu dx^\lambda = \frac{1}{2} R^\alpha{}_{\lambda\mu\nu}\, dx^\mu dx^\nu dx^\lambda = 0$, that
is, $R^\alpha{}_{\lambda\mu\nu}\, dx^\mu dx^\nu dx^\lambda = 0$. Since $dx^\mu dx^\nu dx^\lambda = dx^\nu dx^\lambda dx^\mu = dx^\lambda dx^\mu dx^\nu$ (since at each step
we are “moving a $dx$ past two $dx$s”), we can write the preceding as $R^\alpha{}_{\lambda\mu\nu}(dx^\mu dx^\nu dx^\lambda +
dx^\nu dx^\lambda dx^\mu + dx^\lambda dx^\mu dx^\nu) = 0$. But by renaming the dummy indices we are summing
over, we can also write this as $(R^\alpha{}_{\lambda\mu\nu} + R^\alpha{}_{\mu\nu\lambda} + R^\alpha{}_{\nu\lambda\mu})\, dx^\mu dx^\nu dx^\lambda = 0$. We recover the
cyclic identity $R^\alpha{}_{\lambda\mu\nu} + R^\alpha{}_{\mu\nu\lambda} + R^\alpha{}_{\nu\lambda\mu} = 0$ found in chapter VI.1.

* Incidentally, one of the most appealing features of this discussion is that it brings out the profound
connection between curvature as given in (16) and the field strength in nonabelian gauge theories $F = dA + A^2$,
with $A$ a matrix 1-form. Because of this correspondence between $\omega$ and $A$, the latter is sometimes called the
gauge connection. See QFT Nut.


Appendix 1: Connecting the two connections 


The vielbein $e^\alpha_\mu$ and its inverse $e^\mu_\alpha$ allow us to freely convert Lorentz indices to world indices and vice versa. Given
an arbitrary vector $V^\mu$, $V^\alpha = e^\alpha_\mu V^\mu$ is evidently a Lorentz vector and world scalar. In general, given an arbitrary
tensor, say $T^{\mu\nu}{}_\lambda$, we can construct objects with various mixed tensor structures by contracting with the vielbein
and its inverse in appropriate combinations, such as $T^{\alpha\nu}{}_\gamma = e^\alpha_\mu\, e^\lambda_\gamma\, T^{\mu\nu}{}_\lambda$.

The covariant derivative $D_\mu V^\nu = \partial_\mu V^\nu + \Gamma^\nu_{\mu\lambda} V^\lambda$ of an arbitrary vector $V^\nu$ transforms, as you know well by
now, as a world tensor and a Lorentz scalar. The Christoffel connection adjusts for the fact that the coordinate
transformation varies from place to place. In contrast, the covariant derivative⁵ $D_\mu V^\alpha$ should transform like a
world vector with a lower index and a Lorentz vector with an upper index. The ordinary derivative $\partial_\mu V^\alpha$ transforms
like a world vector, as it should, but not as a Lorentz vector under the location dependent Lorentz transformation
$\Lambda(x)$. Instead, it turns out that we should write

$$D_\mu V^\alpha = \partial_\mu V^\alpha + \omega^\alpha{}_{\beta\mu} V^\beta \tag{18}$$

As usual, when we transform, we have to compensate for the effect of $\partial_\mu$ acting on $\Lambda(x)$ in the first term
by introducing the spin connection $\omega^\alpha{}_{\beta\mu}$. Suppressing Lorentz indices and transforming $V' = \Lambda V$, we have
$D'_\mu V' = \partial_\mu V' + \omega'_\mu V' = \partial_\mu(\Lambda V) + \omega'_\mu \Lambda V = \Lambda(\partial_\mu V) + \omega'_\mu \Lambda V + (\partial_\mu \Lambda)V$ under the transformation. Requiring that
this be equal to $\Lambda D_\mu V = \Lambda(\partial_\mu V + \omega_\mu V)$ gives us the transformation $\omega = \Lambda^{-1}\omega'\Lambda + \Lambda^{-1}(d\Lambda)$, that is,
$\omega' = \Lambda\omega\Lambda^{-1} - (d\Lambda)\Lambda^{-1}$, familiar to you by now from (11).

Consistency relates the Christoffel connection $\Gamma$ and the spin connection $\omega$. Since $D_\mu V^\nu$ is a world tensor, we
must have $D_\mu V^\alpha = e^\alpha_\nu D_\mu V^\nu$. The left hand side is equal to $D_\mu(e^\alpha_\nu V^\nu) = \partial_\mu(e^\alpha_\nu V^\nu) + \omega^\alpha{}_{\beta\mu}\, e^\beta_\nu V^\nu$. Equating this
to the right hand side $e^\alpha_\nu(\partial_\mu V^\nu + \Gamma^\nu_{\mu\lambda} V^\lambda)$, we obtain, after collecting terms and renaming indices, the rather
satisfying and perhaps expected result

$$\partial_\mu e^\alpha_\nu + \omega^\alpha{}_{\beta\mu}\, e^\beta_\nu - \Gamma^\lambda_{\mu\nu}\, e^\alpha_\lambda = 0 \tag{19}$$

This relation tells us that as we move from $x$ to $x + dx$, the $d$ vectors $e^\alpha_\nu$ “rotate” a bit in the Lorentz index $\alpha$ and
a bit in the world index $\nu$, each effected by the corresponding connection, $\omega$ and $\Gamma$, respectively. It may also be
rewritten as

$$D_\mu e^\alpha_\nu = -\omega^\alpha{}_{\beta\mu}\, e^\beta_\nu \tag{20}$$

telling us that covariant differentiation on the vielbein generates an infinitesimal rotation of the local frame.

Recall that the metric is just the Lorentz dot product of two vielbein $g_{\mu\nu} = e^\alpha_\mu\, \eta_{\alpha\beta}\, e^\beta_\nu$, and so we can immediately
conclude that the covariant derivative of the metric vanishes and thus recover (V.6.15). Applying $D_\lambda$ on the metric
and using the antisymmetry of the connection $\omega_{\alpha\beta\lambda} = -\omega_{\beta\alpha\lambda}$, we obtain, as expected, $D_\lambda g_{\mu\nu} = 0$, or in other
words, $\partial_\lambda g_{\mu\nu} = \Gamma_{\mu\cdot\lambda\nu} + \Gamma_{\nu\cdot\lambda\mu}$ (where $\Gamma_{\mu\cdot\lambda\nu} \equiv g_{\mu\rho}\Gamma^\rho_{\lambda\nu}$ and the dot separates the two groups of indices in keeping
with the notation of previous chapters). Recall from chapter II.2 that this equality amounts to the definition of
the Christoffel symbol.

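The relation (19) also leads immediately to Cartan’s first equation (15): $de + \omega e = 0$. We simply have to
compute: $de + \omega e = (\partial_\mu e^\alpha_\nu + \omega^\alpha{}_{\beta\mu}\, e^\beta_\nu)\, dx^\mu dx^\nu = \Gamma^\lambda_{\mu\nu}\, e^\alpha_\lambda\, dx^\mu dx^\nu = 0$, since the Christoffel symbol is symmetric
in its two lower indices.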
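We can also use (19) to determine one connection in terms of the other:

$$\Gamma^\lambda_{\mu\nu} = e^\lambda_\alpha\left(\partial_\mu e^\alpha_\nu + \omega^\alpha{}_{\beta\mu}\, e^\beta_\nu\right) \tag{21}$$

$$\omega^\alpha{}_{\beta\mu} = -e^\nu_\beta\left(\partial_\mu e^\alpha_\nu - \Gamma^\lambda_{\mu\nu}\, e^\alpha_\lambda\right) \tag{22}$$

We can write (22) more compactly as $\omega^\alpha{}_\beta = \Gamma^\alpha{}_\beta + H^\alpha{}_\beta$ by defining the Christoffel 1-form $\Gamma^\alpha{}_\beta = e^\alpha_\lambda\, e^\nu_\beta\, \Gamma^\lambda_{\mu\nu}\, dx^\mu$ and
the 1-form $H^\alpha{}_\beta = -e^\nu_\beta\, \partial_\mu e^\alpha_\nu\, dx^\mu$.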
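To see (21) at work, here is a minimal Python sketch for the unit sphere. It assumes the frame $e^1 = d\theta$, $e^2 = \sin\theta\, d\varphi$ and the spin connection $\omega^1{}_2 = -\cos\theta\, d\varphi$; the function name and the 0-based index layout are made up for this illustration. It should reproduce the familiar Christoffel symbols $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$.

```python
import math

def christoffel_from_vielbein(theta):
    """Evaluate (21) on the unit sphere at polar angle theta.

    Assumed conventions for this sketch: world indices 0 = theta, 1 = phi;
    Lorentz indices 0, 1 stand for the frame labels 1, 2.
    Returns gamma[lam][mu][nu] = Gamma^lam_{mu nu}.
    """
    s, c = math.sin(theta), math.cos(theta)
    # vielbein e[alpha][mu] = e^alpha_mu and inverse einv[mu][alpha] = e^mu_alpha
    e = [[1.0, 0.0], [0.0, s]]
    einv = [[1.0, 0.0], [0.0, 1.0 / s]]
    # de[mu][alpha][nu] = d_mu e^alpha_nu; only d_theta e^2_phi = cos(theta) is nonzero
    de = [[[0.0, 0.0], [0.0, c]], [[0.0, 0.0], [0.0, 0.0]]]
    # w[alpha][beta][mu] = omega^alpha_{beta mu}, with omega^1_2 = -cos(theta) dphi = -omega^2_1
    w = [[[0.0, 0.0], [0.0, -c]], [[0.0, c], [0.0, 0.0]]]
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for lam in range(2):
        for mu in range(2):
            for nu in range(2):
                total = 0.0
                for a in range(2):
                    inner = de[mu][a][nu]
                    for b in range(2):
                        inner += w[a][b][mu] * e[b][nu]
                    total += einv[lam][a] * inner
                gamma[lam][mu][nu] = total
    return gamma
```

Note that the result comes out symmetric in its two lower indices even though neither term in (21) is symmetric by itself.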

I leave it to you to check that Cartan’s second equation (16) also follows immediately.

For any curved spacetime, the symmetry group of the tangent space is always the Lorentz group. In other
words, as explained earlier, the index $\alpha$, by definition, responds to Lorentz transformation. The isometry group
is of course another story entirely, and in fact may well be null. The confusion some students have stems from
the fact that, for flat spacetime, the isometry group and the tangent space group are the same, namely the Lorentz
group.


Appendix 2: Exact is closed, but closed is not necessarily globally exact 


Here I mention an important concept somewhat outside the narrative of this book, involving differential forms. 
It is convenient to introduce some jargon. A $p$-form $\alpha$ is said to be closed if $d\alpha = 0$. It is said to be exact if there
exists a $(p - 1)$-form $\beta$ such that $\alpha = d\beta$.

Talking the talk, we say that (8) tells us that exact forms are closed. 

Is the converse of (8) true? Kind of. The Poincaré lemma states that a closed form is locally exact.⁶ In other
words, if dH = 0 with H some p-form, then locally (that is, within some coordinate patch) 


H=dK (23) 


for some (p — 1)-form K. However, it may or may not be the case that H =dK globally, that is, everywhere. 
Actually, whether you knew it or not, you are probably already familiar with the Poincaré lemma. For example, 
surely you learned somewhere that if the curl of a vector field vanishes, the vector field is locally the gradient of 
some scalar field. 
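The gap between "locally exact" and "globally exact" can be illustrated numerically. The sketch below uses the standard example of the 1-form $(x\, dy - y\, dx)/(x^2 + y^2)$ on the plane with the origin removed; this example and all function names are chosen here for illustration and do not come from the text. The form is closed away from the origin, yet its integral around the unit circle is $2\pi$ rather than 0, so it cannot be $d$ of a single-valued function on the punctured plane.

```python
import math

def winding_form(x, y):
    """Components (P, Q) of the 1-form P dx + Q dy = (x dy - y dx) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return -y / r2, x / r2

def exterior_derivative_coefficient(x, y, h=1e-5):
    """Coefficient of dx dy in d(P dx + Q dy), namely dQ/dx - dP/dy, by central differences."""
    dq_dx = (winding_form(x + h, y)[1] - winding_form(x - h, y)[1]) / (2 * h)
    dp_dy = (winding_form(x, y + h)[0] - winding_form(x, y - h)[0]) / (2 * h)
    return dq_dx - dp_dy

def loop_integral(radius=1.0, steps=2000):
    """Integrate the 1-form around a circle centered on the origin (midpoint rule)."""
    total = 0.0
    for k in range(steps):
        t0 = 2 * math.pi * k / steps
        t1 = 2 * math.pi * (k + 1) / steps
        tm = 0.5 * (t0 + t1)
        x, y = radius * math.cos(tm), radius * math.sin(tm)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        p, q = winding_form(x, y)
        total += p * dx + q * dy
    return total
```

Locally the form is $d$ of the polar angle, which is exactly the kind of function that fails to be single valued around the origin.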

Forms are ready made to be integrated over. For example, given the 2-form $F = \frac{1}{2} F_{\mu\nu}\, dx^\mu dx^\nu$, we can write
$\int_M F$ for any 2-manifold $M$. Note the measure is already included and there is no need to specify a coordinate
choice. Again, whether you knew it or not, you are already familiar with the important theorem*

$$\int_M dH = \int_{\partial M} H \tag{24}$$

with $H$ a $p$-form and $\partial M$ the boundary of a $(p + 1)$-dimensional manifold $M$.

Back in appendix 4 to chapter III.6, I mentioned that, just as a current $J^\mu$ is associated with a point particle,
a current $J^{\mu\nu} = -J^{\nu\mu}$ is associated with a string. It then follows that the analog of the electromagnetic potential
$A_\mu$ coupling to $J^\mu$ is an antisymmetric tensor field $B_{\mu\nu}$ coupling to $J^{\mu\nu}$. Thus, string theory⁷ contains a 2-form
potential $B = \frac{1}{2} B_{\mu\nu}\, dx^\mu dx^\nu$ and the corresponding 3-form field $H = dB$. For some readers, this remark may
clarify further the geometric character of differential forms.


Appendix 3: Spinors in curved spacetime 


This appendix is strictly for readers familiar with the Dirac spinor and should be skipped by others. In relativistic
field theory, spin $\frac{1}{2}$ particles, such as the electron, are described by Dirac spinors. As I explained starting in part V,
Einstein’s equivalence principle renders life easy for us. Given an action in Minkowskian spacetime (for example
Maxwell’s action, in which various Lorentz indices are contracted with $\eta_{\mu\nu}$ and its inverse), we simply replace $\eta_{\mu\nu}$
by $g_{\mu\nu}$ and, lo and behold, we obtain the action in curved spacetime. But this procedure works only with fields
carrying Lorentz indices, such as $A_\mu$. The Dirac spinor, as the name indicates, does not transform as a vector
or a tensor under the Lorentz group, but instead carries a spinorial index, which we denote† by $s$. Let us write
the Dirac spinor as $\psi_s$. In the Dirac action in Minkowskian spacetime, the spinor $\psi_s$ is acted upon by matrices
known as Dirac gamma matrices $(\gamma^\alpha)_{rs}$, which, as indicated by the notation, are labeled by a Lorentz index
but are matrices in spinorial space, thus carrying two spinorial indices $r$ and $s$. (In other words, combinations
such as $(\gamma^\alpha)_{rs}\psi_s$ occur in the action.)

* Namely, Stokes theorem in a more sophisticated form.
† It is commonly denoted by $\alpha, \beta, \cdots$, but those guys have already been pressed into service in this chapter.

The issue is how to promote the Dirac action in Minkowskian spacetime to curved spacetime. Of course,
spacetime derivatives $\partial_\mu$ also occur in the action, but we know how to promote them to curved spacetime, namely
to covariant derivatives $D_\mu$. To make a long story short, and to keep the discussion at the most pedestrian level,
we can state the problem facing us as follows: in constructing an action for the Dirac spinor in curved spacetime,
how do we connect these three types of indices, spinorial $(r, s, \cdots)$, Lorentz $(\alpha, \beta, \cdots)$, and world $(\mu, \nu, \cdots)$? I
already told you that the Dirac gamma matrices connect spinorial and Lorentz indices, so half of the problem is
solved.

Well, by now, you see how the other half is to be solved. The vielbein! Indeed, the vielbein $e^\alpha_\mu$ connects Lorentz
and world indices, and so the vielbein formalism is absolutely essential for the physics of the Dirac spinor in
curved spacetime.

If, in spite of my warning, some intrepid reader, though unfamiliar with spinors, has insisted on going through
this rather cryptic appendix, I hope that he or she will go on plumbing the mystery of the spinor.⁸


Appendix 4: Curvature and covariant derivative in the Cartan formalism 


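The discussion in the text suggests defining the covariant derivative $D = d + \omega$. For definiteness, consider a
0-form $\phi^\alpha$, that is, an object with a Lorentz index but without a world index. Write

$$D^\alpha{}_\beta\, \phi^\beta = d\phi^\alpha + \omega^\alpha{}_\beta\, \phi^\beta$$

We now show how curvature emerges in the Cartan formalism. Calculate

$$D^\alpha{}_\beta\, D^\beta{}_\gamma\, \phi^\gamma = d\left(d\phi^\alpha + \omega^\alpha{}_\beta\, \phi^\beta\right) + \omega^\alpha{}_\beta\left(d\phi^\beta + \omega^\beta{}_\gamma\, \phi^\gamma\right)$$

The first term gives

$$dd\phi^\alpha + (d\omega^\alpha{}_\beta)\phi^\beta - \omega^\alpha{}_\beta\, d\phi^\beta$$

Since $dd = 0$, we have

$$D^\alpha{}_\beta\, D^\beta{}_\gamma\, \phi^\gamma = \left(d\omega^\alpha{}_\gamma + \omega^\alpha{}_\beta\, \omega^\beta{}_\gamma\right)\phi^\gamma = R^\alpha{}_\gamma\, \phi^\gamma$$

We see the curvature 2-form $R^\alpha{}_\gamma$ emerging in front of our very eyes.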


Exercises 


1 For $ds^2 = f(y)^2 dx^2 + g(x)^2 dy^2$, calculate the curvature using differential forms.


2 By writing out the components explicitly, show that dF = 0 states something you are familiar with but is 
disguised in a compact notation. 


3 Consider $F = \frac{g}{4\pi}\, d\cos\theta\, d\varphi$. By transforming to Cartesian coordinates, show that this describes a magnetic
field pointing outward along the radial direction.


4 Calculate the curvature of the conformally flat 2-dimensional space $ds^2 = \Omega^2(x, y)(dx^2 + dy^2)$ using
differential forms. Check your result using the 2-sphere.


5 Extend the calculation of exercise 4 to $d$-dimensional space, that is, calculate the curvature of the conformally
flat space $ds^2 = \Omega^2(x^1, \cdots, x^d)\left((dx^1)^2 + \cdots + (dx^d)^2\right)$ using differential forms. Check your result using
the sphere. Also, show that for $\Omega(x^1, \cdots, x^d) = 1/x^1$, the scalar curvature is constant. We will discuss the
corresponding spacetime, known as anti de Sitter spacetime, in chapter IX.11.


6 Calculate the curvature of the 2-dimensional space with $ds^2 = dr^2 + (f(r, \theta))^2 d\theta^2$ using differential forms.
Recall that we showed, back in appendix 5 to chapter II.2, that we can construct a metric of this form in
general for 2-dimensional spaces.


Notes 




1. Some authors write “dreibeins,” “vierbeins,” and “vielbeins.” Because the plural of “Bein” in German is 
“Beine,” these people are using the English plural of a German word. 

2. Some authors prefer the Greek words dyad, triad, and tetrad. As my friend Cecile DeWitt once said, “Why 
should German words be used for something discovered by a Frenchman?” It is odd indeed. One problem 
is the absence of a good substitute for vielbein: polytrad sounds a bit odd. Another term sometimes used is 
“frame field.” 

3. If we think of the vielbein as vectors $\vec{e}_\mu$, namely $d$ ($d$-dimensional) vectors labeled by the index $\mu$, with the
components of the vector $\vec{e}_\mu$ given by $e^\alpha_\mu$, then $\vec{e}_\mu$ are just the $d$ tangent vectors we encountered before in
chapter I.7 in the context of surfaces.

4. What mathematicians would call Grassmann variables. 

5. Our notation, using the same $D_\mu$ in $D_\mu V^\nu$ and $D_\mu V^\alpha$, is somewhat sloppy. But at the level of this book, it is
a small price to pay to avoid going into fiber bundles and other fancy mathematical topics.

6. For the reader who wants to work through more examples of this statement, see B. Zumino, Y. S. Wu, and 
A. Zee, Nucl. Phys. B 239 (1984), p. 477, in particular, appendix A. 

7. In fact, string theory typically contains numerous p-forms. See J. Polchinski, String Theory. 

8. For further details, see more specialized texts. For an easy introduction, see QFT Nut, p. 445. 


IX.8 Differential Forms Applied 


Calculating curvature with differential forms 


In this chapter, I show you that differential forms provide a more efficient method to 
calculate curvature than the more traditional method of first working out the Christoffel 
symbols. The human mind appears to be ill evolved to handle 3-indexed objects like the 
Christoffel symbols. While the connection 1-form $\omega$ also nominally carries 3 indices,* the
indices are actually of two types, and so in reality, you only have to handle either a 1-indexed
object or a 2-indexed object, depending on how you look at it. Besides, the 1-form
$\omega^{\alpha\beta} = \omega^{\alpha\beta}{}_\mu\, dx^\mu$ is antisymmetric in $\alpha\beta$, while a Christoffel symbol is symmetric in its two
lower indices. In general, an antisymmetric object (lots of vanishing components!) is much
easier to deal with than a symmetric one.

People used to argue about the relative merits of the two methods, but with the advent 
of symbolic manipulation software, the issue is now moot. It is easy to write a simple 
program to calculate the Riemann curvature tensor by brute force using the traditional 
method. Still, Cartan’s method involving differential forms is well worth learning. 

First, let’s recall from the preceding chapter Cartan’s first and second structural
equations:

$$de + \omega e = 0 \tag{1}$$

and

$$R = d\omega + \omega^2 \tag{2}$$


As the emphasis in this chapter is learning to compute with differential forms, I will 
write it with a minimum of prose connecting the equations. You might call this chapter 
engineering with differential forms. 


* If you expand it out into its components $\omega^{\alpha\beta}{}_\mu$.

A reminder about the indices: the world index is $\mu = 0, i$; the Lorentz index is $\alpha = 0, a$.
For $i$, I often use its colloquial name, such as $r$ or $x$. The symbol 0 does double duty. When
I need to emphasize that a particular 0 is a world, rather than a Lorentz, index, I call it
$t$, as for example in (22) below. Keep in mind that, with the $(-+++)$ signature used
in this book, the raising and lowering of an $a$ index entails a $+$ sign, while the raising
and lowering of a 0 index entails a $-$ sign. If we are dealing with space (for example, the
Poincaré half plane discussed below), rather than spacetime, then of course this $-$ sign
does not come in.

Here are some useful relations based on antisymmetry: 


$$\omega^{0a} = +\omega^0{}_a = -\omega^{a0} = \omega^a{}_0 \tag{3}$$
$$\omega^{ab} = +\omega^a{}_b = -\omega^{ba} = -\omega^b{}_a \tag{4}$$
oo, = +0", 0", wo, = —0' wo”, (5) 
$$R^0{}_{a0a} = -R_{0a0a} = -R_{a0a0} = -R^a{}_{0a0} \tag{6}$$
$$R^a{}_{bab} = +R_{abab} = +R_{baba} = R^b{}_{aba} \tag{7}$$


I hardly need say that if you want to learn to compute with Cartan’s approach, you will just 
have to work through everything here and do the exercises. 
In 2 dimensions, the curvature 2-form is just $R = d\omega$, since $\omega^2 = 0$ (note that $\omega^1{}_2\, \omega^2{}_1 = 0$).


Poincaré half plane 


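For $ds^2 = (dr^2 + dx^2)/r^2$, we have $e^1 = dr/r$, $e^2 = dx/r$, or in components, $e^1_r = \frac{1}{r}$, $e^2_x = \frac{1}{r}$.
Thus, we have $de^1 = 0$ and $de^2 = -dr\, dx/r^2 = e^2 e^1$. Hence, using $de^2 = -\omega^2{}_1 e^1$, we obtain
$\omega^1{}_2 = e^2$. We also obtain

$$R^1{}_2 = d\omega^1{}_2 = de^2 = -e^1 e^2 \tag{8}$$

so that $R^1{}_{212} = -1$, from which we have

$$R_{11} = -1, \qquad R_{22} = -1 \tag{9}$$

that is, $R_{\alpha\beta} = -\delta_{\alpha\beta}$. Thus, the Poincaré half plane is a maximally symmetric space with
constant negative curvature. Converting to $R_{\mu\nu} = e^\alpha_\mu\, e^\beta_\nu\, R_{\alpha\beta}$, we obtain

$$R_{rr} = -\frac{1}{r^2}, \qquad R_{xx} = -\frac{1}{r^2} \tag{10}$$

that is, $R_{\mu\nu} = -g_{\mu\nu}$, giving $R = -2$.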
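The 2-dimensional recipe used here (solve Cartan’s first equation for $\omega^1{}_2$, then take $d$) is easy to check numerically. The sketch below assumes an orthogonal metric $ds^2 = A(u,v)^2 du^2 + B(u,v)^2 dv^2$ with frame $e^1 = A\, du$, $e^2 = B\, dv$, for which Cartan’s first equation gives $\omega^1{}_2 = (\partial_v A / B)\, du - (\partial_u B / A)\, dv$. The function name `curvature_2d` and the finite-difference step are choices made for this illustration only.

```python
import math

def curvature_2d(A, B, u, v, h=1e-4):
    """R^1_{212} for ds^2 = A(u,v)^2 du^2 + B(u,v)^2 dv^2 with frame e^1 = A du, e^2 = B dv.

    Uses omega^1_2 = p du + q dv with p = A_v / B and q = -B_u / A, and
    R^1_2 = d omega^1_2 = (dq/du - dp/dv) du dv = R^1_{212} e^1 e^2.
    All derivatives are central differences, so the result is approximate.
    """
    def p(uu, vv):  # du-component of omega^1_2
        return (A(uu, vv + h) - A(uu, vv - h)) / (2 * h) / B(uu, vv)

    def q(uu, vv):  # dv-component of omega^1_2
        return -(B(uu + h, vv) - B(uu - h, vv)) / (2 * h) / A(uu, vv)

    dq_du = (q(u + h, v) - q(u - h, v)) / (2 * h)
    dp_dv = (p(u, v + h) - p(u, v - h)) / (2 * h)
    return (dq_du - dp_dv) / (A(u, v) * B(u, v))
```

With $A = B = 1/r$ (coordinates $u = r$, $v = x$) this should return a value close to $-1$, in line with (8), and with $A = 1$, $B = \sin\theta$ it should return a value close to $+1$ for the unit sphere.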


Expanding universe 


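From

$$ds^2 = -dt^2 + a(t)^2\left(dx^2 + dy^2 + dz^2\right) \tag{11}$$

we have

$$e^0 = dt, \quad e^1 = a(t)\, dx, \quad e^2 = a(t)\, dy, \quad e^3 = a(t)\, dz \tag{12}$$

and hence

$$de^0 = 0 = -\omega^0{}_a\, e^a \tag{13}$$

which implies $\omega^0{}_a \propto e^a$, and

$$de^a = \dot{a}(t)\, dt\, dx^a = \dot{a}(t)\, e^0 dx^a = -\omega^a{}_0\, e^0 - \omega^a{}_b\, e^b \tag{14}$$

We could have $\omega^a{}_b \propto e^b$, but then we cannot satisfy the antisymmetry of $\omega^{ab}$, and so we
conclude that $\omega^{ab} = 0$. Thus, we have

$$\omega^a{}_0 = \omega^0{}_a = \dot{a}(t)\, dx^a \tag{15}$$

$$R^0{}_b = d\omega^0{}_b + \omega^0{}_c\, \omega^c{}_b = \ddot{a}\, dt\, dx^b + 0 = \frac{\ddot{a}}{a}\, e^0 e^b \tag{16}$$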


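To guide you, I have indicated here (and in the following) that in this formalism, quite a
few terms are equal to 0:

$$R^a{}_b = d\omega^a{}_b + \omega^a{}_c\, \omega^c{}_b + \omega^a{}_0\, \omega^0{}_b = 0 + 0 + \dot{a}^2\, dx^a dx^b = \frac{\dot{a}^2}{a^2}\, e^a e^b \tag{17}$$

Note that in this problem, there is a residual $SO(3)$ symmetry in the spatial indices. Thus,
$R^0{}_b$ and $R^a{}_b$ must be proportional to $e^0 e^b$ and $e^a e^b$, respectively:

$$R^0{}_{b0b} = \frac{\ddot{a}}{a}, \qquad R_{0b0b} = R_{b0b0} = -R^0{}_{b0b} \quad \text{(no sum)} \tag{18}$$

$$R^a{}_{bab} = \frac{\dot{a}^2}{a^2} \quad \text{(no sum)} \tag{19}$$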
a 


As indicated, the repeated indices are not summed. We temporarily suspend the Einstein
summation convention. Then we have

$$R_{00} = \sum_b R^b{}_{0b0} = -3\,\frac{\ddot{a}}{a} \tag{20}$$

$$R_{bb} = R^0{}_{b0b} + \sum_{a \neq b} R^a{}_{bab} = \frac{\ddot{a}}{a} + 2\,\frac{\dot{a}^2}{a^2} \tag{21}$$

You should understand where the 3 and 2 come from.
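Now we can work out the nonvanishing components of the Ricci tensor:

$$R_{tt} = e^0_t\, e^0_t\, R_{00} = -3\,\frac{\ddot{a}}{a} \tag{22}$$

$$R_{ij} = \sum_b e^b_i\, e^b_j\, R_{bb} = a^2\left(\frac{\ddot{a}}{a} + 2\,\frac{\dot{a}^2}{a^2}\right)\delta_{ij} = \left(a\ddot{a} + 2\dot{a}^2\right)\delta_{ij} \tag{23}$$

Finally, the scalar curvature, which we can work out more easily from (20) and (21) than
from (22) and (23), is given by

$$R = -R_{00} + \sum_b R_{bb} = 6\left(\frac{\ddot{a}}{a} + \frac{\dot{a}^2}{a^2}\right) \tag{24}$$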
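A quick arithmetic cross-check of (20), (21), and the scalar curvature can be sketched in a few lines of Python. The function name `frw_curvature` is made up for this illustration; it simply evaluates the frame components given the values of $a$, $\dot{a}$, and $\ddot{a}$ at one instant.

```python
import math

def frw_curvature(a, adot, addot):
    """Frame components R_00, R_bb (same for b = 1, 2, 3) and the scalar curvature.

    R_00 and R_bb follow (20) and (21); the scalar is -R_00 + sum_b R_bb because
    eta^{00} = -1 in the (-+++) signature.
    """
    r00 = -3.0 * addot / a
    rbb = addot / a + 2.0 * adot ** 2 / a ** 2
    scalar = -r00 + 3.0 * rbb
    return r00, rbb, scalar
```

Two test cases one might try: exponential expansion $a = e^{Ht}$, for which the scalar curvature should come out as $12H^2$, and $a = t^{1/2}$, for which it should vanish.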
Maximally symmetric 3-spaces


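Let’s foliate 3-space with a series of spheres separated by the distance $F(r)\, dr$:

$$ds^2 = F(r)^2 dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2 \tag{25}$$

so that

$$e^1 = F(r)\, dr, \quad e^2 = r\, d\theta, \quad e^3 = r \sin\theta\, d\varphi \tag{26}$$

We obtain easily $de^1 = 0$, $de^2 = dr\, d\theta$, $de^3 = \sin\theta\, dr\, d\varphi + r\cos\theta\, d\theta\, d\varphi$.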

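Solving Cartan’s first equation for the connection 1-form gets more involved as the
dimension of space (or spacetime) increases. In general, write $\omega^\alpha{}_\beta = \omega^\alpha{}_{\beta\mu}\, dx^\mu = \omega^\alpha{}_{\beta\gamma}\, e^\gamma$.
Plug this into Cartan’s first structural equation (1) and match terms. We obtain

$$\omega^1{}_2 = -\frac{1}{rF}\, e^2 = -\frac{1}{F}\, d\theta, \quad \omega^1{}_3 = -\frac{1}{rF}\, e^3 = -\frac{1}{F}\sin\theta\, d\varphi, \quad \omega^2{}_3 = -\frac{\cos\theta}{r\sin\theta}\, e^3 = -\cos\theta\, d\varphi \tag{27}$$

Cartan’s second equation gives us

$$R^1{}_2 = \frac{F'}{rF^3}\, e^1 e^2, \quad R^1{}_3 = \frac{F'}{rF^3}\, e^1 e^3, \quad R^2{}_3 = \frac{1}{r^2}\left(1 - \frac{1}{F^2}\right) e^2 e^3 \tag{28}$$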


Using differential forms, you have to exercise some judgment. For instance, the connection
1-forms are written in two equivalent versions in (27). To calculate $R$, it is somewhat
easier to use the second version.

From (28), we simply read off $R^1{}_{212} = \frac{F'}{rF^3}$, $R^1{}_{313} = \frac{F'}{rF^3}$, and $R^2{}_{323} = \frac{1}{r^2}\left(1 - \frac{1}{F^2}\right)$. (Confused
about the factors of 2, or absence thereof? See the preceding chapter.) Contracting
Riemann, we arrive at the nicely symmetric Ricci tensor

$$R_{11} = \frac{2F'}{rF^3}, \qquad R_{22} = R_{33} = \frac{F'}{rF^3} + \frac{1}{r^2}\left(1 - \frac{1}{F^2}\right) \tag{29}$$


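If we require that the space be maximally symmetric and positively curved with some
length scale $L$, so that $R_{ab} = \frac{2}{L^2}\delta_{ab}$, we obtain the differential equation $\frac{F'}{rF^3} = \frac{1}{L^2}$, with the
solution $F^2 = 1/\left(1 - \frac{r^2}{L^2}\right)$. From (28), we find $R^1{}_2 = \frac{1}{L^2}\, e^1 e^2$, $R^1{}_3 = \frac{1}{L^2}\, e^1 e^3$, and
$R^2{}_3 = \frac{1}{L^2}\, e^2 e^3$. With the change of variable $r = L\sin\chi$, the metric becomes

$$ds^2 = L^2\left(d\chi^2 + \sin^2\chi\, d\theta^2 + \sin^2\chi \sin^2\theta\, d\varphi^2\right) \tag{30}$$

Here we have nothing other than $S^3$, of course, as you might recall from exercise I.5.9.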


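We can formally go to the maximally symmetric negatively curved space by setting
$L^2 \to -L^2$, so that $F^2 = 1/\left(1 + \frac{r^2}{L^2}\right)$. With $r = L\sinh\chi$, the metric is given by

$$ds^2 = L^2\left(d\chi^2 + \sinh^2\chi\, d\theta^2 + \sinh^2\chi \sin^2\theta\, d\varphi^2\right) \tag{31}$$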
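As a sanity check on the two solutions for $F$, the following Python sketch evaluates the frame components read off from (28) numerically. The helper names are hypothetical, and the derivative $F'$ is approximated by a central difference, so the output is only approximately constant.

```python
import math

def riemann_components(F, r, h=1e-5):
    """Return (R^1_{212}, R^2_{323}) as read off from (28); R^1_{313} equals the first entry."""
    dF = (F(r + h) - F(r - h)) / (2 * h)   # central-difference estimate of F'(r)
    return dF / (r * F(r) ** 3), (1.0 - 1.0 / F(r) ** 2) / r ** 2

def F_sphere(L):
    """F for the positively curved case, F^2 = 1 / (1 - r^2 / L^2), valid for r < L."""
    return lambda r: 1.0 / math.sqrt(1.0 - r * r / (L * L))

def F_hyperbolic(L):
    """F for the negatively curved case, F^2 = 1 / (1 + r^2 / L^2)."""
    return lambda r: 1.0 / math.sqrt(1.0 + r * r / (L * L))
```

For $L = 2$, both components should come out close to $+1/4$ in the first case and $-1/4$ in the second, independent of $r$.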
Spherically symmetric static spacetimes


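Start with the metric

$$ds^2 = -E(r)^2 dt^2 + \left(F(r)^2 dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2\right) \tag{32}$$

or $e^0 = E\, dt$, $e^1 = F\, dr$, $e^2 = r\, d\theta$, $e^3 = r\sin\theta\, d\varphi$. Note that the notation used in chapter
VI.3 is related to that used here by $A = E^2$, $B = F^2$.

After some slightly tedious matching of terms, we solve Cartan’s first structural equation
to obtain

$$\begin{aligned}
\omega^2{}_3 &= -\frac{\cos\theta}{r\sin\theta}\, e^3, & \omega^1{}_3 &= -\frac{1}{rF}\, e^3, & \omega^2{}_0 &= 0 \\
\omega^1{}_2 &= -\frac{1}{rF}\, e^2, & \omega^1{}_0 &= \frac{E'}{EF}\, e^0, & \omega^3{}_0 &= 0
\end{aligned} \tag{33}$$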

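Cartan’s second equations now give us

$$R^2{}_3 = \frac{1}{r^2}\left(1 - \frac{1}{F^2}\right) e^2 e^3, \quad R^1{}_2 = \frac{F'}{rF^3}\, e^1 e^2, \quad R^1{}_3 = \frac{F'}{rF^3}\, e^1 e^3 \tag{34}$$

and

$$R^1{}_0 = \frac{1}{EF}\left(\frac{E'}{F}\right)' e^1 e^0, \quad R^2{}_0 = \frac{E'}{rEF^2}\, e^2 e^0, \quad R^3{}_0 = \frac{E'}{rEF^2}\, e^3 e^0 \tag{35}$$

Note that spherical symmetry relates $R^2{}_0$ and $R^3{}_0$, giving us a useful check on our arithmetic.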

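One nice feature of the differential form approach is that once we have the curvature
2-form, we can simply read off the Riemann curvature tensor:

$$R^2{}_{323} = \frac{1}{r^2}\left(1 - \frac{1}{F^2}\right), \quad R^1{}_{212} = \frac{F'}{rF^3}, \quad R^1{}_{313} = \frac{F'}{rF^3} \tag{36}$$

and

$$R^1{}_{010} = \frac{1}{EF}\left(\frac{E'}{F}\right)', \quad R^2{}_{020} = \frac{E'}{rEF^2}, \quad R^3{}_{030} = \frac{E'}{rEF^2} \tag{37}$$

The Ricci tensor follows immediately:

$$R_{00} = \frac{1}{EF}\left(\frac{E'}{F}\right)' + \frac{2E'}{rEF^2}, \qquad R_{11} = \frac{2F'}{rF^3} - \frac{1}{EF}\left(\frac{E'}{F}\right)' \tag{38}$$

and

$$R_{22} = R_{33} = \frac{1}{r^2}\left(1 - \frac{1}{F^2}\right) + \frac{F'}{rF^3} - \frac{E'}{rEF^2} \tag{39}$$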


Similar to the discussion in chapter VI.3, we see that second derivatives appear in $R_{00}$ and
$R_{11}$ but not in the combination $R_{00} + R_{11} = \frac{2E'}{rEF^2} + \frac{2F'}{rF^3} = \frac{2}{rF^2}\left(\frac{E'}{E} + \frac{F'}{F}\right)$. Comparing the
two discussions, we see that the appearance of this cancellation is packaged somewhat
more clearly here. The content of the two formalisms is of course the same, as we see
immediately by the substitution $A = E^2$, $B = F^2$ in (VI.3.5-7).
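Formulas (38) and (39) can be exercised numerically. The sketch below evaluates them with finite-difference derivatives for user-supplied $E(r)$ and $F(r)$; the function names are invented for this illustration. As a test case it assumes the Schwarzschild form $E^2 = 1/F^2 = 1 - r_S/r$, which is not derived in this chapter, and for which all three components should vanish up to numerical error.

```python
import math

def ricci_static(E, F, r, h=1e-4):
    """Frame Ricci components (R_00, R_11, R_22) of (38) and (39); R_33 equals R_22.

    Derivatives are central differences, so the results carry small numerical errors.
    """
    def d(f, x):
        return (f(x + h) - f(x - h)) / (2 * h)

    e, f = E(r), F(r)
    dE, dF = d(E, r), d(F, r)
    second = d(lambda x: d(E, x) / F(x), r) / (e * f)   # (1/EF) (E'/F)'
    r00 = second + 2.0 * dE / (r * e * f ** 2)
    r11 = 2.0 * dF / (r * f ** 3) - second
    r22 = (1.0 - 1.0 / f ** 2) / r ** 2 + dF / (r * f ** 3) - dE / (r * e * f ** 2)
    return r00, r11, r22

def schwarzschild(rs):
    """E and F with E^2 = 1/F^2 = 1 - rs/r (assumed test case), valid for r > rs."""
    E = lambda r: math.sqrt(1.0 - rs / r)
    F = lambda r: 1.0 / math.sqrt(1.0 - rs / r)
    return E, F
```

The same function also shows the cancellation mentioned above: the second-derivative piece enters $R_{00}$ and $R_{11}$ with opposite signs.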




Anti de Sitter spacetime 


We will discuss de Sitter and anti de Sitter spacetimes in detail in chapters IX.10 and IX.11. 
Here it suffices to note that the metric 


$$ds^2 = \frac{-dt^2 + dx^2 + dy^2 + dr^2}{r^2} \tag{40}$$

generalizes the Poincaré half plane to (3 + 1)-dimensional spacetime. Here we have

$$e^0 = \frac{dt}{r}, \quad e^1 = \frac{dx}{r}, \quad e^2 = \frac{dy}{r}, \quad e^3 = \frac{dr}{r} \tag{41}$$


It is useful to decompose the index set $\alpha = 0, 1, 2, 3$ into $\alpha = \hat\alpha, 3$, with $\hat\alpha = 0, 1, 2$, and
to recognize that there exists a residual $SO(2, 1)$ symmetry transforming the indices
$\hat\alpha = 0, 1, 2$ among themselves. The index 3 is special in this problem. It is convenient
also to restrict the early letters $a$, $b$, and so on to denote 1, 2.

The residual $SO(2, 1)$ tells us that $\omega^{\hat\alpha}{}_3 = k e^{\hat\alpha}$ must have the form indicated. For the
purpose of dimensional analysis here, we take $t$, $x$, $y$, and $r$ to have dimensions of length,
so that $e^\alpha$ and hence $\omega^\alpha{}_\beta$ are dimensionless. Thus, $k$ is dimensionless and by $SO(2, 1)$
can only be a constant. Furthermore, $SO(2, 1)$ and antisymmetry imply that $\omega^{0a} = 0$ and
$\omega^{12} = 0$. These considerations render the arithmetic a snap.
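Thus, for example, $de^0 = -\frac{dr\, dt}{r^2} = -e^3 e^0 = -\omega^0{}_\alpha\, e^\alpha = -\omega^0{}_3\, e^3$, giving us

$$\omega^0{}_3 = -e^0 \quad \text{and hence} \quad \omega^{\hat\alpha}{}_3 = -e^{\hat\alpha} \tag{42}$$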


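We can easily check the symmetry conclusion here by calculating $de^1$, for example.
The same kind of symmetry considerations tell us that $R^{\hat\alpha\hat\beta} = C e^{\hat\alpha} e^{\hat\beta}$ and $R^{\hat\alpha}{}_3 = D e^{\hat\alpha} e^3$,
with numerical constants $C$ and $D$. Direct evaluation of Cartan’s second structural equation
gives $C = D = -1$. For example, $R^{\hat\alpha}{}_3 = d\omega^{\hat\alpha}{}_3 + \omega^{\hat\alpha}{}_{\hat\beta}\, \omega^{\hat\beta}{}_3 = -de^{\hat\alpha} = -e^{\hat\alpha} e^3$. Collect these
results:

$$R^0{}_a = -e^0 e^a, \quad R^1{}_2 = -e^1 e^2, \quad R^0{}_3 = -e^0 e^3, \quad R^a{}_3 = -e^a e^3 \tag{43}$$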


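$$R^0{}_{a0a} = -1 = -R^a{}_{0a0}, \quad R^1{}_{212} = -1, \quad R^0{}_{303} = -1 = -R^3{}_{030}, \quad R^a{}_{3a3} = -1 \quad \text{(no sum)} \tag{44}$$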


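(Here we have used (6) and (7).) Contracting Riemann, we obtain Ricci:

$$R_{33} = R^0{}_{303} + \sum_a R^a{}_{3a3} = -1 - 1 - 1 = -3 \tag{45}$$

$$R_{00} = R^3{}_{030} + \sum_a R^a{}_{0a0} = +1 + 2 = +3 \tag{46}$$

$$R_{aa} = R^0{}_{a0a} + R^b{}_{aba} + R^3{}_{a3a} = -1 - 1 - 1 = -3 \quad (b \neq a) \tag{47}$$

These results are summarized by

$$R_{\alpha\beta} = -3\,\eta_{\alpha\beta} \tag{48}$$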


The spacetime is maximally symmetric. 
Exercises


1 Warped polar coordinates: using forms, calculate the curvature for $ds^2 = dr^2 + f(r)^2 d\theta^2$, with $\theta \equiv \theta + 2\pi$
an angular coordinate. Determine the $f(r)$ that gives constant curvature. Verify that measuring the
circumference of a small circle around the origin gives the same result.


2 Consider the class of spaces described by $ds^2 = y^{2p} dx^2 + x^{2p} dy^2$. Use differential forms to find the curvature
as a function of $p$ and determine the two values of $p$ for which the space is flat. Hint: Euclidean space is a
member of this class. Recall exercise VI.1.17.


3 Using forms, calculate the curvature of the torus. Recall that the metric was worked out in exercise I.5.16.


4 For $ds^2 = -dt^2 + A^2(t)\, dx^2 + B^2(t)\, dy^2 + C^2(t)\, dz^2$, calculate the curvature using differential forms. Solve
for the Kasner universe in exercise VI.2.1. Extend your work to higher dimensions.
Conformal transformation


The reader learning Einstein gravity for the first time can safely skip over this chapter. I 
will use conformal algebra only in the next chapter and in the chapter on twistors, and 
then only peripherally. 

Recall that in chapter IX.6, an isometry is defined as a transformation $x \to x'(x)$ under
which $g'_{\rho\sigma}(x') = g_{\rho\sigma}(x')$. Suppose we feel more relaxed, and, instead, impose the more
forgiving condition that $g'_{\rho\sigma}(x') = \Omega^2(x')\, g_{\rho\sigma}(x')$ for some unknown function $\Omega$. To use a
terminology first introduced in chapter I.6, we do not demand that the two metrics $g'_{\mu\nu}$
and $g_{\mu\nu}$ are equal, but merely that they are conformally related.

In other words, we ask whether $g_{\mu\nu}(x)\, \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x^\nu}{\partial x'^\sigma} = \Omega^2(x')\, g_{\rho\sigma}(x')$, an easier-going version
of (IX.6.1), has any solutions. As before, we content ourselves with an infinitesimal
transformation $x'^\mu = x^\mu + \varepsilon\xi^\mu(x)$. In the small $\varepsilon$ limit, we expand $\Omega^2(x') \simeq 1 + \varepsilon\kappa(x') =
1 + \varepsilon\kappa(x) + O(\varepsilon^2)$. Collecting terms of order $\varepsilon$, we find that this condition amounts to
what is known as the conformal Killing condition

$$g_{\mu\sigma}\,\partial_\rho \xi^\mu + g_{\rho\nu}\,\partial_\sigma \xi^\nu + \xi^\lambda \partial_\lambda g_{\rho\sigma} + \kappa\, g_{\rho\sigma} = 0 \tag{1}$$


We can eliminate the unknown function $\kappa(x)$ by contracting with $g^{\rho\sigma}$, so that this leads
to a condition on the metric $g_{\mu\nu}$ and the vector field $\xi$, known as a conformal Killing
vector field. For $\kappa = 0$, the conformal Killing condition reduces to the Killing or isometry
condition of chapter IX.6.

As in chapter IX.6, we can write (1) more compactly as $\xi_{\sigma;\rho} + \xi_{\rho;\sigma} + \kappa\, g_{\rho\sigma} = 0$ using the
covariant derivative, or as (recall (IX.6.4))

$$\mathcal{L}_\xi\, g_{\mu\nu} = -\kappa\, g_{\mu\nu} \tag{2}$$

using the Lie derivative¹ along the conformal Killing vector field $\xi$. (Also, as in chapter IX.6,
we will often drop the word “field.”)


Retreat to flat spacetime 


If someone hands us a metric, we could in principle find its conformal Killing vectors by 
solving (1). 

The simplest metric to deal with is the Minkowski metric, of course. In this introductory
text, we are content to study this easy case, for which (1) simplifies to (with $\xi_\sigma = \eta_{\sigma\mu}\xi^\mu$,
as usual) $\partial_\rho \xi_\sigma + \partial_\sigma \xi_\rho + \kappa\,\eta_{\rho\sigma} = 0$. Contracting this with $\eta^{\rho\sigma}$, we obtain $\kappa = -2\,\partial\cdot\xi/d$ in
$d$-dimensional spacetime. Hence the condition (1) becomes

$$\partial_\rho \xi_\sigma + \partial_\sigma \xi_\rho = \frac{2}{d}\,\eta_{\rho\sigma}\, \partial\cdot\xi \tag{3}$$

Infinitesimal transformations $x'^\mu = x^\mu + \varepsilon\xi^\mu(x)$ that satisfy (3) are said to generate the
conformal algebra for Minkowski spacetime. (Clearly, with the substitution of $\delta_{\mu\nu}$ for $\eta_{\mu\nu}$,
this entire discussion applies to flat space as well as flat spacetime.)

Compare this condition with the Killing equation $\partial_\rho \xi_\sigma + \partial_\sigma \xi_\rho = 0$ for Minkowski spacetime,
with the most general solution $\xi^\mu = a^\mu + b^\mu{}_\nu x^\nu$ describing translations and the
Lorentz transformations, namely rotations and boosts. Here $b^{\mu\nu} \equiv b^\mu{}_\lambda\, \eta^{\lambda\nu} = -b^{\nu\mu}$ is required
to be antisymmetric. (This goes way back to (I.3.7) and (III.3.13).)


In search of conformal generators 


At this point, you can solve (3) for $\xi^\mu$. Go ahead. Alternatively, we could wing it like a poor
man. Stare at Minkowski spacetime: $ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu$. What transformations on $x$ would
change the metric conformally?

By eyeball, we see that the scale transformation, or more academically, dilation,* $x^\mu \to
\lambda x^\mu$ for $\lambda$ a real number, would do the job, since $ds^2$ becomes $\lambda^2 \eta_{\mu\nu}\, dx^\mu dx^\nu$. We have
stretched spacetime by a constant factor. The conformal Killing condition is satisfied with
$\Omega(x)$ constant. To identify the corresponding $\xi^\mu$, consider an infinitesimal transformation
with $\lambda = 1 + \varepsilon c$; then $\xi^\mu = c x^\mu$ with some (irrelevant) constant $c$. Sure enough, this
satisfies (3), of course. Now, can you find another transformation? Think for a minute
before reading on.

The clever poor man notices that inversion, $x^\mu = e^2 y^\mu / y^2$, would work.† Plug in $dx^\mu =
e^2(\delta^\mu_\lambda\, y^2 - 2 y_\lambda y^\mu)\, dy^\lambda / (y^2)^2$. We obtain $ds^2 = \eta_{\mu\nu}\, dx^\mu dx^\nu = (e^4/(y^2)^2)\, \eta_{\mu\nu}\, dy^\mu dy^\nu$, which
indeed is conformally flat. I introduced $e$ to avoid confusing you, but now that its job is
done, we will set it to 1 and define inversion as the transformation (for $x^2 \neq 0$)

$$x^\mu \to \frac{x^\mu}{x^2} \tag{4}$$

* Or dilatation, if dilation is not academic enough for you.
† Another (irrelevant) constant $e$, with dimension of length, is introduced here to ensure that $x$ and $y$ both
have dimensions of length.

You object, saying that the entire discussion has been couched in terms of infinitesimal
transformations. The inversion is a discrete transformation and is in no way no how
infinitesimal. How then can we identify the corresponding $\xi^\mu$?

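Now the poor man makes another clever move: invert, translate by some vector $a^\mu$, then
invert back. For $a^\mu = 0$, the two inversions knock each other out, and we end up with the
identity transformation. Thus, the net result of these three transformations would indeed
be an infinitesimal transformation as $a^\mu \to 0$. Let’s work out what I just said:

$$\begin{aligned}
x^\mu \to \frac{x^\mu}{x^2} \to \frac{x^\mu}{x^2} + a^\mu \to\; & \left(\frac{x^\mu}{x^2} + a^\mu\right) \Big/ \, \eta_{\rho\sigma}\left(\frac{x^\rho}{x^2} + a^\rho\right)\left(\frac{x^\sigma}{x^2} + a^\sigma\right) \\
=\; & \left(\frac{x^\mu}{x^2} + a^\mu\right) \Big/ \left(\frac{1}{x^2} + \frac{2\, a\cdot x}{x^2} + a^2\right) \\
=\; & \left(x^\mu + a^\mu x^2\right) / \left(1 + 2\, a\cdot x + a^2 x^2\right) \simeq \left(x^\mu + a^\mu x^2\right)\left(1 - 2\, a\cdot x\right) + O(a^2) \\
=\; & x^\mu + a_\lambda\left(\eta^{\mu\lambda} x^2 - 2 x^\mu x^\lambda\right) + O(a^2)
\end{aligned} \tag{5}$$

The transformation $x^\mu \to x^\mu + a_\lambda(\eta^{\mu\lambda} x^2 - 2 x^\mu x^\lambda)$ is known as a conformal transformation.
You can verify that $\xi^\mu = a_\lambda(\eta^{\mu\lambda} x^2 - 2 x^\mu x^\lambda)$ satisfies (3), of course, since inversion
and translation both satisfy the condition we started out with.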
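The invert, translate, invert sequence in (5) is simple enough to try numerically. The Python sketch below, with made-up function names and the $(-+++)$ Minkowski metric in $d = 4$ assumed, compares the exact composition with the first-order formula. Components of $x$ and $a$ are taken with upper indices, so that $a_\lambda \eta^{\mu\lambda} = a^\mu$ and $a_\lambda x^\lambda = a\cdot x$.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, (-+++) signature

def dot(u, v):
    """Minkowski dot product of two vectors given by their upper-index components."""
    return sum(e * a * b for e, a, b in zip(ETA, u, v))

def invert(x):
    """The inversion (4): x^mu -> x^mu / x^2, defined only for x^2 != 0."""
    x2 = dot(x, x)
    return [xi / x2 for xi in x]

def invert_translate_invert(x, a):
    """Exact composition: invert, translate by a, invert back."""
    y = invert(x)
    y = [yi + ai for yi, ai in zip(y, a)]
    return invert(y)

def first_order_prediction(x, a):
    """Last line of (5) without the O(a^2) terms: x^mu + a^mu x^2 - 2 (a.x) x^mu."""
    x2, ax = dot(x, x), dot(a, x)
    return [xi + ai * x2 - 2.0 * ax * xi for xi, ai in zip(x, a)]
```

For a small $a$ of order $10^{-4}$, the two results should agree to roughly the square of that, while each differs from $x$ at first order in $a$.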

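As I said, you could have also simply solved (3) by brute force, and I am counting on
you to have already done so. It is also instructive to act with $\partial^\rho = \eta^{\rho\lambda}\partial_\lambda$ on (3); we obtain

$$d\, \partial^2 \xi_\sigma = (2 - d)\, \partial_\sigma (\partial\cdot\xi) \tag{6}$$

Applying $\partial^\sigma$, we obtain further $\partial^2(\partial\cdot\xi) = 0$ (all for $d \neq 1$).
We can now draw two important conclusions.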
We can now draw two important conclusions. 


1. The case $d = 2$ is special. We learn from (6) that any solution of the generalized Laplace
equation $\partial^2\xi_\mu = 0$ yields a conformal transformation. Indeed, for $d = 2$, either go to light cone
coordinates for Minkowski spacetime or to complex coordinates for Euclidean space. With
complex coordinates $z = x + iy$, we have $(\partial_x^2 + \partial_y^2)\xi_\mu = (\partial_x + i\partial_y)(\partial_x - i\partial_y)\xi_\mu = 4\frac{\partial}{\partial\bar z}\frac{\partial}{\partial z}\xi_\mu = 0$,
and hence we can exploit the full power of complex analysis. This observation turns out
to be of central importance in string theory.² For $d = 2$, there exists an infinite number of
solutions of (3) for $\xi$.

2. For $d \neq 2$, these equations tell us that $\xi^\mu$ can depend on $x$ at most quadratically. Thus, we
have in fact found all solutions of (3) for $d \neq 2$, namely

$$\xi^\mu = a^\mu + b^\mu{}_\nu x^\nu + c\,x^\mu + d_\nu(\eta^{\mu\nu}x^2 - 2x^\mu x^\nu) \tag{7}$$

with $b^{\mu\nu}$ antisymmetric. We had noted all these terms already. Pleasingly, in (7), the constant
term corresponds to translation, the linear terms to Lorentz transformation and to dilation,
and the quadratic term to conformal transformation.


Generators of conformal algebra 


Associated with each of these terms in (7), we have a generator of the Minkowskian conformal
algebra. As in chapter I.3, it is convenient to use a differential operator representation.
Recall that back in chapter III.3, by adding the generators* of translation $P_\mu$ to those of
Lorentz transformation $J_{\mu\nu}$, we extended the Lorentz algebra to the Poincaré algebra, defined by commuting

$$P_\mu = \partial_\mu \quad\text{and}\quad J_{\mu\nu} = (x_\mu\partial_\nu - x_\nu\partial_\mu) \tag{8}$$

By adding the dilation generator $D$ and the conformal generator $K^\mu$,

$$D = x^\mu\partial_\mu \quad\text{and}\quad K^\mu = (\eta^{\mu\nu}x^2 - 2x^\mu x^\nu)\,\partial_\nu \tag{9}$$

we can now, in turn, extend the Poincaré algebra to the conformal algebra, defined by
commuting $P_\mu$, $J_{\mu\nu}$, $D$, and $K^\mu$.

In other words, the commutators between P, J, D, and K generate an algebra that 
contains the Poincaré algebra. 

The commutators involving $D$ are easy to compute: $[D, x^\nu] = [x^\mu\partial_\mu, x^\nu] = x^\mu[\partial_\mu, x^\nu] = x^\nu$
and $[D, \partial_\nu] = [x^\mu\partial_\mu, \partial_\nu] = [x^\mu, \partial_\nu]\partial_\mu = -\partial_\nu$. (To work out various commutators, keep in
mind the identity $[A, BC] = [A, B]C + B[A, C]$.) Evidently, $D$, as is sensible for a dilation
generator, simply counts the length dimension, $+1$ for $x^\nu$ and $-1$ for $\partial_\nu$. Thus, $[D, J_{\mu\nu}] = 0$,
since $J \sim x\partial$ has zero length dimension. Interestingly, another way of reading this
is to write it as $[J_{\mu\nu}, D] = 0$, which says that $D$ is a Lorentz scalar. Next, we can read
off $[D, P^\mu] = -P^\mu$ and $[D, K^\mu] = +K^\mu$ just by counting powers of length dimension
($P \sim \partial$, $K \sim xx\partial$).

The commutators involving $K^\mu$ are not much harder to work out. First, $[J^{\mu\nu}, K^\lambda] = -\eta^{\mu\lambda}K^\nu + \eta^{\nu\lambda}K^\mu$
just tells us that $K^\mu$ transforms like a vector, as expected. The nontrivial
commutator is $[K^\mu, P^\lambda] = -[\partial^\lambda, (\eta^{\mu\nu}x^2 - 2x^\mu x^\nu)\partial_\nu] = -2(\eta^{\mu\nu}x^\lambda - \eta^{\lambda\mu}x^\nu - \eta^{\lambda\nu}x^\mu)\partial_\nu = 2(J^{\mu\lambda} + \eta^{\mu\lambda}D)$.
Finally, verify that $[K^\mu, K^\nu] = 0$. Can you see why? (Recall that we constructed
the conformal transformation as an inversion followed by a translation and then
followed by another inversion.)


* Here I omit the overall factors of i commonly included in quantum mechanics. As explained in chapter III.3, 
you and I live in free countries and, according to what is convenient in a given context, could include or omit 
overall factors at will. 
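Collecting our results, we have the conformal algebra

$$\begin{aligned}
&[P^\mu, P^\nu] = 0, \qquad [K^\mu, K^\nu] = 0 \\
&[D, P^\mu] = -P^\mu, \qquad [D, J_{\mu\nu}] = 0, \qquad [D, K^\mu] = +K^\mu \\
&[J^{\mu\nu}, P^\lambda] = -\eta^{\mu\lambda}P^\nu + \eta^{\nu\lambda}P^\mu, \qquad [J^{\mu\nu}, K^\lambda] = -\eta^{\mu\lambda}K^\nu + \eta^{\nu\lambda}K^\mu \\
&[J^{\mu\nu}, J^{\lambda\rho}] = -\eta^{\mu\lambda}J^{\nu\rho} - \eta^{\nu\rho}J^{\mu\lambda} + \eta^{\nu\lambda}J^{\mu\rho} + \eta^{\mu\rho}J^{\nu\lambda} \\
&[K^\mu, P^\lambda] = 2(J^{\mu\lambda} + \eta^{\mu\lambda}D)
\end{aligned} \tag{10}$$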


We see that, in some sense, K acts like the dual of P. 
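The nontrivial entries of (10) can be spot-checked numerically. The sketch below treats each generator in (8) and (9) as a vector field $V = v^\nu(x)\partial_\nu$ and uses the standard fact that the commutator of two first-order operators is again a vector field with components $v^\rho\partial_\rho w^\nu - w^\rho\partial_\rho v^\nu$. The choice $d = 4$, the signature, and all function names are assumptions made for illustration.

```python
import random

DIM = 4
# Minkowski metric with signature (-,+,+,+); numerically eta^{mu nu} = eta_{mu nu}.
ETA = [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(DIM)]
       for i in range(DIM)]

def x_squared(x):
    return sum(ETA[i][i] * x[i] * x[i] for i in range(DIM))

# Each generator returns the components v^nu(x) multiplying d_nu.
def P(lam):                       # P^lam = eta^{lam nu} d_nu
    return lambda x: [ETA[lam][nu] for nu in range(DIM)]

def Dil(x):                       # D = x^nu d_nu
    return list(x)

def J(mu, lam):                   # J^{mu lam} = x^mu d^lam - x^lam d^mu
    return lambda x: [x[mu] * ETA[lam][nu] - x[lam] * ETA[mu][nu] for nu in range(DIM)]

def K(mu):                        # K^mu = (eta^{mu nu} x^2 - 2 x^mu x^nu) d_nu
    return lambda x: [ETA[mu][nu] * x_squared(x) - 2.0 * x[mu] * x[nu] for nu in range(DIM)]

def bracket(V, W, x, h=1e-4):
    """Components of [V, W] at x; central differences are exact for quadratic fields."""
    v, w = V(x), W(x)
    out = [0.0] * DIM
    for rho in range(DIM):
        xp, xm = list(x), list(x)
        xp[rho] += h
        xm[rho] -= h
        dV = [(p - m) / (2.0 * h) for p, m in zip(V(xp), V(xm))]
        dW = [(p - m) / (2.0 * h) for p, m in zip(W(xp), W(xm))]
        for nu in range(DIM):
            out[nu] += v[rho] * dW[nu] - w[rho] * dV[nu]
    return out

def max_error_KP(x):
    """Largest deviation of [K^mu, P^lam] from 2 (J^{mu lam} + eta^{mu lam} D)."""
    worst = 0.0
    for mu in range(DIM):
        for lam in range(DIM):
            lhs = bracket(K(mu), P(lam), x)
            j = J(mu, lam)(x)
            rhs = [2.0 * (j[nu] + ETA[mu][lam] * x[nu]) for nu in range(DIM)]
            worst = max(worst, max(abs(l - r) for l, r in zip(lhs, rhs)))
    return worst

def max_error_KK(x):
    """Largest component of [K^mu, K^nu], which should vanish."""
    return max(abs(c) for mu in range(DIM) for nu in range(DIM)
               for c in bracket(K(mu), K(nu), x))

random.seed(1)
x0 = [random.uniform(-1.0, 1.0) for _ in range(DIM)]
err_KP = max_error_KP(x0)
err_KK = max_error_KK(x0)
err_DK = max(abs(l - r) for mu in range(DIM)
             for l, r in zip(bracket(Dil, K(mu), x0), K(mu)(x0)))   # [D, K^mu] = +K^mu
print(err_KP, err_KK, err_DK)
```

Only three of the commutators are checked here; the others in (10) can be tested the same way by swapping the arguments of `bracket`.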


Identifying the conformal algebra 


Now that we have used our eyeballs and brains, let's use our fingers. Count the number of
generators $(P, K, D, J)$: $d + d + 1 + \frac12 d(d-1) = \frac12(d+2)(d+1)$. Do you know a group
with this many generators?

Yes, SO(d, 2). Good guess! 

Remarkably, the conformal algebra of $d$-dimensional Minkowski spacetime with the
Lorentz group $SO(d-1, 1)$ is the Lie algebra of $SO(d, 2)$. The two algebras are isomorphic.
The rule is that given $SO(d-1, 1)$, the conformal algebra is $SO(d-1+1, 1+1) = SO(d, 2)$:
we "go up" by $(1, 1)$. We can prove this assertion by the "what else could it
be" argument. (The only uncertainty is the signature. Counting only tells us that it could
be the algebra for the group $SO(p, q)$ with $p + q = d + 2$ and containing $SO(d-1, 1)$.)
We can of course verify the assertion by direct computation and thus also ascertain the
signature.
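The counting argument is easy to mechanize. The following sketch (function names are illustrative) compares the number of conformal generators with the dimension $n(n-1)/2$ of an orthogonal group acting on $n = d + 2$ coordinates, for a range of $d$.

```python
def conformal_generator_count(d):
    """Translations + special conformal + dilation + Lorentz generators."""
    return d + d + 1 + d * (d - 1) // 2

def so_dimension(n):
    """Number of generators of SO(p, q) with p + q = n (independent of signature)."""
    return n * (n - 1) // 2

counts = {d: (conformal_generator_count(d), so_dimension(d + 2)) for d in range(1, 12)}
print(counts[4])   # d = 4: both entries equal 15
```

As the text notes, the count alone cannot fix the signature; it only says that the algebra has as many generators as some $SO(p, q)$ with $p + q = d + 2$.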

Denote the generators of $SO(d, 2)$ by $J^{MN}$, with $M, N = 0, 1, 2, \ldots, d-1, d, d+1$
(and $\mu, \nu = 0, 1, 2, \ldots, d-1$) satisfying

$$[J^{MN}, J^{PQ}] = -\eta^{MP}J^{NQ} - \eta^{NQ}J^{MP} + \eta^{NP}J^{MQ} + \eta^{MQ}J^{NP} \tag{11}$$

with $\eta^{d,d} = -1$ and $\eta^{d+1,d+1} = +1$. The isomorphism between the two algebras is almost
fixed by symmetry considerations. We already have the generators of the Lorentz group
$SO(d-1, 1)$, namely $J^{\mu\nu}$. Now we want to identify the additional generators $D$, $P^\mu$, and
$K^\mu$. By eyeball, we see that $D$ is a scalar under $SO(d-1, 1)$, and so it can only be $J^{d,d+1}$.
We identify $J^{d,d+1} = D$. Similarly, by eyeball, we see that $P^\mu$ and $K^\mu$ carry an index $\mu$,
and hence are vectors under $SO(d-1, 1)$. They can only be linear combinations of $J^{\mu,d}$ and
$J^{\mu,d+1}$. So, let us make the educated guess $J^{\mu,d} = (K^\mu + P^\mu)/2$ and $J^{\mu,d+1} = (K^\mu - P^\mu)/2$.
We will check only a few commutators to show that this assignment is
correct. For example, (11) gives

$$[J^{\mu,d}, J^{\nu,d}] = -\eta^{d,d}J^{\mu\nu} = J^{\mu\nu} = \tfrac14[K^\mu + P^\mu, K^\nu + P^\nu] = \tfrac14\left([K^\mu, P^\nu] - [K^\nu, P^\mu]\right) = \tfrac14(4J^{\mu\nu})$$

where in the last step, we used (10). Similarly,

$$[J^{\mu,d+1}, J^{\nu,d+1}] = -\eta^{d+1,d+1}J^{\mu\nu} = -J^{\mu\nu} = \tfrac14[K^\mu - P^\mu, K^\nu - P^\nu] = -\tfrac14(4J^{\mu\nu})$$


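As another example,

$$[J^{\mu,d}, J^{\nu,d+1}] = -\eta^{\mu\nu}J^{d,d+1} = -\eta^{\mu\nu}D = \tfrac14[K^\mu + P^\mu, K^\nu - P^\nu] = -\tfrac14\left([K^\mu, P^\nu] + [K^\nu, P^\mu]\right) = -\tfrac14(4\eta^{\mu\nu}D)$$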


The poor man now speaks up, "It is easier to see through all this if we pick a definite
$d$, say 6, and forget about signature, let it take care of itself. Just think about $SO(6)$."
Evidently, $J^{\mu\nu}$, for $\mu, \nu = 1, 2, 3, 4$, generates the rotation algebra for $SO(4)$. In addition,
we have $J^{\mu,5}$ and $J^{\nu,6}$, clearly vectors labeled by 5 and 6. For $\mu \neq \nu$, they commute with
each other, while for $\mu = \nu$, they commute to produce $J^{56}$. Recall, as we learned way back
in chapter I.3, that (11) merely says that $J^{MN}$ and $J^{PQ}$ commute with each other, unless
a pair of indices, one from each of the $J$s, are equal, in which case the commutator is a $J$
carrying the remaining two indices. Thus, $J^{56}$ commuted with $J^{\mu,5}$ and $J^{\mu,6}$ just turns
one into the other.


(1+1)-dimensional Minkowski spacetime in light cone coordinates 


It is instructive to work out the conformal algebra for a familiar spacetime written in not-so-familiar
coordinates, namely the (1+1)-dimensional Minkowski spacetime written
in light cone coordinates (as was described³ in appendix 5 to chapter VII.2). Define
$x^\pm = t \pm x$. Then $ds^2 = -dt^2 + dx^2 = -dx^+dx^- = \eta_{\mu\nu}dx^\mu dx^\nu$, which tells us that $\eta_{+-} = \eta_{-+} = -\frac12$
and $\eta^{+-} = \eta^{-+} = -2$. (For this discussion, we adopt the convention that the
components we do not display, such as $\eta^{++}$, all vanish.) Also, define $\partial_\pm = \frac12\left(\frac{\partial}{\partial t} \pm \frac{\partial}{\partial x}\right)$, so
that $\partial_+x^+ = 1$ and $\partial_-x^- = 1$.

Then $P^\pm = \partial^\pm = -2\partial_\mp$, $D = x^+\partial_+ + x^-\partial_-$, and $J \equiv -\frac12 J^{+-} = -\frac12(x^+\partial^- - x^-\partial^+) = x^+\partial_+ - x^-\partial_-$.
Note that $D \pm J = 2x^\pm\partial_\pm$ works out nicely. (It is understood that the $\pm$ signs are
correlated unless otherwise noted.) Can you guess what the conformal generators are?
Let's find out; simply evaluate (9), using $x^2 = -x^+x^-$: $K^+ = x^2\eta^{+-}\partial_- - 2x^+(x^+\partial_+ + x^-\partial_-) = -2(x^+)^2\partial_+$,
and similarly, $K^- = -2(x^-)^2\partial_-$.

Rather elegantly, the 6 generators of $SO(2, 2)$ can be taken to be

$$\partial_\pm, \quad x^\pm\partial_\pm, \quad\text{and}\quad (x^\pm)^2\partial_\pm \tag{12}$$

Recall that in chapter VII.2, we constructed the Penrose diagram for $M^{1,1}$, introducing
the compact variables $X^\pm$ by $x^\pm = \tan X^\pm$. Note that $\frac{\partial}{\partial X^\pm} = (1 + (x^\pm)^2)\,\partial_\pm$. In the
conformally equivalent spacetime described by the cylinder $R \times S^1$, the time coordinate is
given by $T = X^+ + X^-$. Time translation along the cylinder is then generated by

$$\frac{\partial}{\partial T} = \frac12\left(\frac{\partial}{\partial X^+} + \frac{\partial}{\partial X^-}\right) = \frac12\left((1 + (x^+)^2)\,\partial_+ + (1 + (x^-)^2)\,\partial_-\right) \tag{13}$$

You can now check the algebra in (10). For example, (10) gives $[K^+, P^-] = 2(J^{+-} + \eta^{+-}D) = -4(J + D) = -8x^+\partial_+$,
and indeed, we compute $[K^+, P^-] = [-2(x^+)^2\partial_+, -2\partial_+] = -8x^+\partial_+$.


To the lost, angles are more important than distances 


Some readers are no doubt already aware of the many motivations—historical, mathemat- 
ical, and physical—for studying conformal transformations. Here I mention but a few. 
The key property is of course that conformal transformations preserve angles between 
line segments. 

When you are lost, it matters more to you to know that you are going in the right direction 
than to know how far you are from your destination. To the lost, angles are more important 
than distances. Gerardus Mercator (1512-1594) (or “Jerry the merchant”) fully appreciated 
this. As you worked out in exercise I.5.3, the Mercator map of the world is obtained by a
conformal transformation of the spherical coordinates $(\theta, \varphi)$ on the globe. Mathematically,
I already mentioned the connection to complex analysis and the consequent implications 
for string theory. A humbler, but no less beautiful, physical motivation for studying the 
conformal map is the method of images in electrostatics. 

I cannot resist digressing a bit to remind you how it works. Consider the following
problem. A charge $q$ is located at $R\hat e_q$, in the presence of a grounded conducting sphere
of radius $a$ centered at the origin. Calculate the potential $\phi(\vec r)$ at an arbitrary point
$\vec r = r\hat e$. (The unit vectors $\hat e$ and $\hat e_q$ point toward the observer and the charge, respectively.)

We take the potential due to the charge $q$ and subtract from it the potential due to an
image charge $q'$ located at $R'\hat e_q$. (Note that we invoke implicitly a symmetry argument to
locate the image charge along the vector $\hat e_q$.) Then we have

$$\phi(\vec r) = \frac{q}{|r\hat e - R\hat e_q|} - \frac{q'}{|r\hat e - R'\hat e_q|} = \frac{q}{r\,|\hat e - \frac{R}{r}\hat e_q|} - \frac{q'}{R'\,|\hat e_q - \frac{r}{R'}\hat e|} \tag{14}$$

Using the key observation that $|\hat e - \kappa\hat e_q| = |\hat e_q - \kappa\hat e|$, we see that we can satisfy the
boundary condition $\phi(r = a, \theta, \varphi) = 0$ if we choose $\frac{q}{a} = \frac{q'}{R'}$ and $\frac{R}{a} = \frac{a}{R'}$. The locations of
the charge and of its image are related by an inversion $R' = a^2/R$.
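The image construction is simple to test numerically. The sketch below places the charge on the $z$ axis (an assumption made purely for convenience), uses the same units as (14), and evaluates the potential at several points on the sphere $r = a$ with $q' = qa/R$ and $R' = a^2/R$. The function name and the sample values are illustrative.

```python
import math

def image_potential(point, q, R, a):
    """Potential of a charge q at (0, 0, R) minus that of its image q' = q a / R at (0, 0, a^2 / R)."""
    q_image = q * a / R
    R_image = a * a / R

    def distance_to_axis_point(z0):
        return math.sqrt(point[0] ** 2 + point[1] ** 2 + (point[2] - z0) ** 2)

    return q / distance_to_axis_point(R) - q_image / distance_to_axis_point(R_image)

q, R, a = 1.0, 3.0, 1.0           # sample values with the charge outside the sphere (R > a)
surface_values = [
    image_potential((a * math.sin(th), 0.0, a * math.cos(th)), q, R, a)
    for th in (0.0, 0.5, 1.0, 2.0, 3.0)
]
max_surface_potential = max(abs(v) for v in surface_values)
print(max_surface_potential)      # zero up to floating-point rounding
```

Away from the sphere the two terms no longer cancel, which is easy to confirm by evaluating `image_potential` at a point with $r \neq a$.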

That this clever method works is due to the scale, or dilation, invariance of the Coulomb
potential. It would not work, for example, with the short-ranged Yukawa potential $\propto e^{-mr}/r$.
In other words, in electrostatics, we are solving Laplace's equation with appropriate
boundary conditions, and Laplace's equation does not contain any intrinsic length scale.
(Under inversion, we have to adjust the boundary condition accordingly; this is why we
have to adjust the strength of the image charge.)


Scale and conformal invariances in particle and condensed matter physics 


It is one thing to discuss the conformal algebra, but it is another to ascertain whether 
a given physical situation actually respects conformal invariance. In particle physics, one 
longstanding hope has been that at high energies, particle masses can be neglected, so that 
the physics would become scale invariant. It turns out that in a local field theory, it is true, 
more or less in general, that scale invariance typically leads to conformal invariance.* For
example, we will check in appendix 1 that Maxwell's action, $\sim \int d^4x\, F_{\mu\nu}F^{\mu\nu}$, discussed in
chapter IV.2, is both scale and conformal invariant. In condensed matter physics, intrinsic
length scales are typically washed out at the critical point between two phases, so that scale
and conformal invariances come in full blast into the theory of critical phenomena.†

I included this material on conformal algebra not so much because I will refer to it in 
the next chapter on de Sitter spacetime, but because the concepts involved are important. 
Indeed, as I am completing this book, the $\mathcal{N} = 4$ supersymmetric Yang-Mills theory is all
the rage in the theoretical community. Not only does this theory have a conformal algebra, 
it also has a superconformal algebra. 


Appendix 1: Maxwell’s action is both scale and conformal invariant 


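Since $A'_\mu(x') = A_\rho(x)\frac{\partial x^\rho}{\partial x'^\mu}$, we have, under an infinitesimal transformation $x' = x - \xi$ with $\xi$ arbitrary,

$$\begin{aligned}
\delta A_\mu(x) = A'_\mu(x) - A_\mu(x) &= \left(A'_\mu(x) - A'_\mu(x')\right) + \left(A'_\mu(x') - A_\mu(x)\right) \\
&= \xi^\rho\partial_\rho A_\mu + A_\rho\partial_\mu\xi^\rho
\end{aligned} \tag{15}$$

(we are also suppressing $\epsilon$ and introducing a minus sign for convenience). The attentive reader will recall
from chapter V.6 that this is just the Lie derivative $\mathcal{L}_\xi A_\mu(x)$ defined in (V.6.26). Similarly, $\delta F_{\mu\nu} = \mathcal{L}_\xi F_{\mu\nu} = \xi^\rho\partial_\rho F_{\mu\nu} + F_{\rho\nu}\partial_\mu\xi^\rho + F_{\mu\rho}\partial_\nu\xi^\rho$.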

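We can now evaluate the variation of the Maxwell Lagrangian $\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu}$ for an arbitrary $\xi$. (A trivial
heads-up: the symbols $\mathcal{L}$ and $\mathcal{L}_\xi$ denote entirely different beasts.) We have

$$\begin{aligned}
\delta(F^{\mu\nu}F_{\mu\nu}) &= 2F^{\mu\nu}\left(\xi^\rho\partial_\rho F_{\mu\nu} + F_{\rho\nu}\partial_\mu\xi^\rho + F_{\mu\rho}\partial_\nu\xi^\rho\right) \\
&= \partial_\rho\left(\xi^\rho F^{\mu\nu}F_{\mu\nu}\right) - (\partial\cdot\xi)\left(F^{\mu\nu}F_{\mu\nu}\right) + 4F^{\mu\nu}F_{\rho\nu}\partial_\mu\xi^\rho \\
&= \partial_\rho\left(\xi^\rho F^{\mu\nu}F_{\mu\nu}\right) + 2F^{\mu\nu}F^\rho{}_\nu\left(\partial_\mu\xi_\rho + \partial_\rho\xi_\mu - \tfrac12\eta_{\mu\rho}\,\partial\cdot\xi\right)
\end{aligned} \tag{16}$$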


* This is because the violation of scale invariance and conformal invariance are both determined⁴ by the trace
$T^\mu{}_\mu$ of the energy momentum tensor. Indeed, we saw a hint of this in our discussion of the relativistic gas and of
the electromagnetic field in chapters III.6 and VI.4.

† In both these types of physical applications, subtleties due to quantum and thermal fluctuations must be
taken into account.
Thus far, $\xi$ is arbitrary, but if $\xi$ is a conformal Killing vector, that is, if it satisfies (3), then we obtain

$$\delta\mathcal{L} = \partial_\rho\left(\mathcal{L}\,\xi^\rho\right) + \left(\frac{4}{d} - 1\right)(\partial\cdot\xi)\,\mathcal{L} \tag{17}$$

For $d = 4$, and only for $d = 4$, the variation $\delta\mathcal{L}$ of the Maxwell Lagrangian under a conformal Killing transformation
is a total divergence, so that the variation of the action $\delta S = \int d^4x\,\delta\mathcal{L}$ vanishes (with the usual suitable
boundary conditions at spacetime infinity). Thus, in the spacetime we live in, the Maxwell action* is both scale
and conformal invariant. (In particular, it is also inversion invariant, so you can go on happily using the method 
of images.) As a bonus, we also reconfirm what we have known for a long time, ever since part IV, that it is 
translation and Lorentz invariant. And of course, the last property is what got us started on this amazing epic 
journey toward the heart of spacetime. 


Appendix 2: Conformally related spacetimes 


For the sake of pedagogical clarity, we hastily retreated from the conformal Killing condition (1) in all its glory
to its humbler flat version (3). Nevertheless, it is often useful to study what happens in an arbitrary curved
spacetime. To be specific, let us look at Maxwell's action $S_{\rm Maxwell}(g_{\mu\nu}, A_\mu) = -\frac14\int d^dx\,\sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho}F_{\mu\lambda}F_{\nu\rho}$ from
this alternative point of view. We have also indicated explicitly that $S_{\rm Maxwell}$ is a functional of $g_{\mu\nu}$ and $A_\mu$.

Now suppose that somebody hands you another metric $\tilde g_{\mu\nu}(x) = \Omega^2(x)g_{\mu\nu}(x)$ conformally related to the metric
we have. (Perhaps it is still worthwhile to emphasize that the two metrics are not related by a coordinate transformation.)
Since $\tilde g^{\mu\nu} = \Omega^{-2}g^{\mu\nu}$ and $\tilde g = \Omega^{2d}g$, we have $S_{\rm Maxwell}(\tilde g_{\mu\nu}, A_\mu) = -\frac14\int d^dx\,\sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho}\,\Omega^{d-4}F_{\mu\lambda}F_{\nu\rho}$.
Thus, for $d = 4$, and only for $d = 4$, we have $S_{\rm Maxwell}(\tilde g_{\mu\nu}, A_\mu) = S_{\rm Maxwell}(g_{\mu\nu}, A_\mu)$.

In general, if we are given an action in curved spacetime such that $S(\Omega^2 g_{\mu\nu}, \cdots) = S(g_{\mu\nu}, \cdots)$, where the
ellipses indicate various fields we are not touching at all (such as $A_\mu$ in the specific example just given), we can
immediately take the infinitesimal limit $\Omega^2(x) \simeq 1 + \epsilon(x)$, so that $\delta g_{\mu\nu}(x) = \tilde g_{\mu\nu}(x) - g_{\mu\nu}(x) = \epsilon(x)g_{\mu\nu}(x)$, and
deduce

$$\delta S = 0 = \int d^4x\,\frac{\delta S}{\delta g_{\mu\nu}(x)}\,\epsilon(x)g_{\mu\nu}(x) = -\frac12\int d^4x\,\sqrt{-g}\,\epsilon(x)\,g_{\mu\nu}(x)T^{\mu\nu}(x) \tag{18}$$

Since $\epsilon(x)$ is arbitrary and local, we conclude that the trace $T(x) \equiv g_{\mu\nu}(x)T^{\mu\nu}(x)$ vanishes, a result we have
already derived in chapters III.6 and VI.4.

Here it is important to let $\Omega(x)$ depend on $x$, so that we can deduce from (18) that the trace $T(x)$ vanishes
locally, that is, at any $x$. But to demonstrate that $S_{\rm Maxwell}(g_{\mu\nu}, A_\mu)$ is invariant for $d = 4$, as we did a bit earlier, we
could have taken a constant $\Omega$ independent of $x$. In our demonstration, we merely counted powers of $\Omega$, and no
derivative ever acted on $\Omega$. In other words, we could consider simply multiplying the metric by a constant: $g_{\mu\nu} \to \Omega^2 g_{\mu\nu}$,
$g^{\mu\nu} \to \Omega^{-2}g^{\mu\nu}$, and $g \to \Omega^8 g$. At the risk of being repetitious, $S_{\rm Maxwell} = -\frac14\int d^4x\,\sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho}F_{\mu\lambda}F_{\nu\rho}$
is invariant because $\sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho} \to \Omega^4\Omega^{-2}\Omega^{-2}\sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho} = \sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho}$.

But now we can make a connection with high school dimensional analysis. Scale $x \to \omega x$, with $\omega$ an arbitrary
real number. Then we have $\partial \to \omega^{-1}\partial$. Gauge invariance requires that $A_\mu$ scales the same way as $\partial_\mu$,
so that $A \to \omega^{-1}A$ and $F \to \omega^{-2}F$. In fact, you see that the powers of $\omega$ just correspond to the length
dimensions of various quantities.⁶ For example, $x$ has length dimension $+1$ (by definition), and $F$ has length
dimension $-2$. So, the invariance of $S_{\rm Maxwell}$ for $d = 4$ is just the statement that $\int d^4x\,\sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho}F_{\mu\lambda}F_{\nu\rho} \to \omega^4\omega^{-4}\int d^4x\,\sqrt{-g}\,g^{\mu\nu}g^{\lambda\rho}F_{\mu\lambda}F_{\nu\rho}$
is dimensionless in length. By now, you also see the connection with the
scaling in the preceding paragraph: the scaling $\sqrt{-g} \to \Omega^4\sqrt{-g}$ corresponds to the scaling $d^4x \to \omega^4d^4x$ for
the proverbial high school student, and the scaling $g^{\mu\nu}g^{\lambda\rho} \to \Omega^{-4}g^{\mu\nu}g^{\lambda\rho}$ corresponds to the scaling $F_{\mu\lambda}F_{\nu\rho} \to \omega^{-4}F_{\mu\lambda}F_{\nu\rho}$.
Note that, perhaps amusingly, in the high school approach, we do not touch the metric, while in the
conformal transformation, we touch only the metric.

We now see that what we did in chapter VI.1 amounts to saying that $\int d^4x\,\sqrt{-g}\,R$ in Einstein gravity has length
dimension $+2$: four powers of length from $d^4x$ and two negative powers of length from the two $\partial$s contained
in $R$.


* You should convince yourself that for $d \neq 4$, the Maxwell action is manifestly not scale invariant. In contrast,
Laplace's equation, in any spatial dimension, contains no scale.
Exercises


1 Show that under inversion, $(x_1 - x_2)^2 \to (x_1 - x_2)^2/(x_1^2x_2^2)$ and thus the separation between spacetime points
is not preserved. However, if the two points are null separated, they remain null separated under inversion.
Null separation is a conformally invariant concept.


2 In the text, the poor man realizes that inversion $x^\mu = e^2y^\mu/y^2$ of the Minkowski metric gives a conformally
invariant metric. How about the transformation $x^\mu = f^4y^\mu/(y^2)^2$, with $f$ a constant with dimensions of
length?


3 Sometimes the best way to learn a formalism is to apply it to a trivial problem to which we know the answer. 
Determine the conformal Killing vector fields of the Euclidean plane. 


Notes 


1. If $\xi^\mu_a$ denotes a set of conformal Killing vectors for $a = 1, \cdots, n$ for some $n$, you can show that the commutators
$[\mathcal{L}_{\xi_a}, \mathcal{L}_{\xi_b}]$ generate an algebra known as the conformal algebra.

2. See J. Polchinski, String Theory, chapter 2.

3. The coordinates $(p, q)$ and $(P, Q)$ there correspond to $x^\pm$ and $X^\pm$ here, respectively.


4. For a concise statement of when scale invariance implies conformal invariance, see Y. Nakayama, “Gravity 
Dual for a Model of Perception,” http://arxiv.org/pdf/1003.5729. 

5. You may recall that we did this kind of scaling to check our computations back in chapter VI.2. 

6. Looking ahead, we will be using this sort of reasoning in chapter X.3. 
Which curved spacetime is the most lovable?


Of all the curved spaces, we love the sphere most. This is of course due to the high degree of
symmetry enjoyed by the sphere in all its manifestations, including the circle. In particular,
every point on the sphere is identical to any other point: the sphere is a homogeneous space.
Indeed, as explained in chapter IX.6 on isometry, it is maximally symmetric.

Among the curved spacetimes, which one should we love the most? Which spacetimes 
are closest to the spheres? Kepler talked about the music of the spheres; we’ve become 
somewhat more sophisticated 400 years later. 


De Sitter spacetime 


The $d$-dimensional sphere $S^d$ of radius $L$ is defined as the set of all points $(X^1, X^2, \cdots, X^{d+1})$
in $(d+1)$-dimensional Euclidean space $E^{d+1}$ (that is, a space with $ds^2 = (dX^1)^2 + (dX^2)^2 + \cdots + (dX^d)^2 + (dX^{d+1})^2$) satisfying

$$(X^1)^2 + (X^2)^2 + \cdots + (X^d)^2 + (X^{d+1})^2 = L^2 \qquad (\text{sphere } S^d) \tag{1}$$

In analogy, let us define the $d$-dimensional de Sitter spacetime¹,* $dS^d$ with length scale $L$
as the set of all points $(X^0, X^1, X^2, \cdots, X^d)$ in $(d+1)$-dimensional Minkowskian spacetime
$M^{d,1}$ (that is, a spacetime with $ds^2 = -(dX^0)^2 + (dX^1)^2 + (dX^2)^2 + \cdots + (dX^d)^2$)
satisfying

$$-(X^0)^2 + (X^1)^2 + (X^2)^2 + \cdots + (X^d)^2 = L^2 \qquad (\text{de Sitter spacetime } dS^d) \tag{2}$$


We have renamed $X^{d+1}$ as $X^0$ and by a feat of "imaginary magic" turned it into a time
coordinate. Thus, de Sitter spacetime is sort of a Minkowskian version of the sphere living
in Minkowski spacetime.

This flip of sign makes all the difference in the world: at a given $X^0$, the spatial
coordinates $(X^1, X^2, \cdots, X^d)$ form a $(d-1)$-dimensional sphere $S^{d-1}$ defined by $(X^1)^2 + (X^2)^2 + \cdots + (X^d)^2 = L^2 + (X^0)^2$.
Topologically, de Sitter spacetime is then $R \times S^{d-1}$: as
the time coordinate $X^0$ goes from $-\infty$ to $\infty$, the radius $\sqrt{L^2 + (X^0)^2}$ of $S^{d-1}$ starts at
infinity, contracts to a minimum value $L$, and then expands again to infinity. See figure 1.
Contrast the circles of constant latitude on the globe, expanding from the south pole to the
equator and then contracting again toward the north pole. In the $(X^0$-$X^d)$ plane, we have
a hyperbola, and so $dS^d$ can also be regarded as a hyperboloid of rotation.

Figure 1 The $d$-dimensional de Sitter spacetime $dS^d$ embedded in $(d+1)$-dimensional
Minkowskian spacetime $M^{d,1}$. (Axis label: $X^i$, $i = 1, \cdots, d$.)

A word of caution about figures of this type: We naturally tend to look at it as if
it were drawn in Euclidean space, while in fact, $dS^d$ is constructed in Minkowskian
spacetime $M^{d,1}$.


Maximal symmetry and coset manifold 


The isometry group of $S^d$ is clearly $SO(d+1)$, the rotation group of the embedding
space $E^{d+1}$, with the Killing generators $\left(X^M\frac{\partial}{\partial X^N} - X^N\frac{\partial}{\partial X^M}\right)$, $M, N = 1, 2, \cdots, d+1$.
The sphere $S^d$ can thus be regarded as the coset manifold $SO(d+1)/SO(d)$, where the
quotient group $SO(d)$ is the subgroup of $SO(d+1)$ that leaves a point on $S^d$ invariant, as
we discussed in chapter IX.6. (Think about this for $d = 2$.)

Evidently, the isometry group² of de Sitter spacetime $dS^d$ is $SO(d, 1)$, the Lorentz group
of the embedding space $M^{d,1}$. The Killing generators fall into two sets, $d$-dimensional
rotations and boosts:

$$X^M\frac{\partial}{\partial X^N} - X^N\frac{\partial}{\partial X^M} \quad\text{and}\quad X^M\frac{\partial}{\partial X^0} + X^0\frac{\partial}{\partial X^M}, \quad\text{for } M, N = 1, 2, \cdots, d \tag{3}$$

obtained by letting $X^{d+1} \to iX^0$ formally. Note the sign flip between the two sets, just as
in the familiar Lorentz algebra of special relativity.

Hence, just like the sphere $S^d$, de Sitter spacetime is also a coset manifold: $dS^d = SO(d, 1)/SO(d-1, 1)$.
Consider, for example, the point $X_* = (X^0, X^1, X^2, \cdots, X^d)_* = (0, 0, \cdots, L)$ on $dS^d$:
it is left invariant by the subgroup $SO(d-1, 1)$, namely the
Lorentz group acting on the $d$ coordinates $(X^0, X^1, X^2, \cdots, X^{d-1})$. In particular, $dS^4 = SO(4, 1)/SO(3, 1)$.
As a small check, note that $SO(4, 1)$ has $\frac{5\cdot4}{2} = 10$ generators, while
$SO(3, 1)$ has $\frac{4\cdot3}{2} = 6$ generators, so that $SO(4, 1)/SO(3, 1)$ is indeed $10 - 6 = 4$
dimensional.

The group $SO(d, 1)$ moves points on $dS^d$ around. We thus conclude that, just like the
sphere, de Sitter spacetime is maximally symmetric. So, according to the general theory
of maximally symmetric spaces explained in chapter IX.6, the Riemann curvature tensor
$R_{\mu\nu\lambda\sigma}$ must be equal to $(g_{\mu\lambda}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\lambda})$ up to an overall constant. (Here the Greek
indices* range over $\mu = 0, 1, \cdots, d-1$.) Now notice that we have constructed de Sitter
spacetime but have yet to specify a set of coordinates on it.

Confusio: “Isn't $(X^0, X^1, X^2, \cdots, X^d)$ a set of coordinates?”

No, that is a set of coordinates for the ambient Minkowski space $M^{d,1}$.

Suppose we choose our coordinates on $dS^d$ to have dimensions of length so that $g_{\mu\nu}$ is
normalized to be dimensionless. Then by dimensional analysis, we must have

$$R_{\mu\nu\lambda\sigma} = \frac{1}{L^2}\left(g_{\mu\lambda}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\lambda}\right) \tag{4}$$


(We will show presently one way of determining the overall numerical coefficient.) The
Ricci tensor, the scalar curvature, and the Einstein tensor are fixed, upon contraction of
the indices in (4), to be

$$R_{\mu\nu} = \frac{d-1}{L^2}g_{\mu\nu}, \qquad R = \frac{d(d-1)}{L^2}, \qquad E_{\mu\nu} \equiv R_{\mu\nu} - \frac12 g_{\mu\nu}R = -\frac{(d-1)(d-2)}{2L^2}g_{\mu\nu} \tag{5}$$


respectively. Since (4) and (5) are equalities between tensors, they hold in every coordinate 
system. 
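The contractions leading from (4) to (5) can be carried out by brute force. The sketch below evaluates them at a point where the metric is taken to be $\mathrm{diag}(-1, +1, \cdots, +1)$ (legitimate for a tensor identity, and an assumption of this demo), contracting the first and third indices of (4) with the inverse metric. It uses exact rational arithmetic, and the function name is illustrative.

```python
from fractions import Fraction

def de_sitter_contractions(d, L2=Fraction(1)):
    """Return (Ricci coefficient, scalar curvature, Einstein coefficient) for the tensor (4).

    The coefficients multiply g_{mu nu}; L2 stands for L^2.
    """
    g = [[Fraction(0)] * d for _ in range(d)]
    for i in range(d):
        g[i][i] = Fraction(-1 if i == 0 else 1)
    ginv = g                      # diag(-1, 1, ..., 1) is its own inverse

    def riemann(m, n, l, s):
        return (g[m][l] * g[n][s] - g[m][s] * g[n][l]) / L2

    # R_{n s} = g^{m l} R_{m n l s}
    ricci = [[sum(ginv[m][l] * riemann(m, n, l, s) for m in range(d) for l in range(d))
              for s in range(d)] for n in range(d)]
    scalar = sum(ginv[n][s] * ricci[n][s] for n in range(d) for s in range(d))
    einstein00 = ricci[0][0] - Fraction(1, 2) * g[0][0] * scalar
    return ricci[0][0] / g[0][0], scalar, einstein00 / g[0][0]

ricci_coeff, scalar_curv, einstein_coeff = de_sitter_contractions(4)
print(ricci_coeff, scalar_curv, einstein_coeff)   # 3 12 -3 in units of 1/L^2
```

For $d = 4$ this reproduces $R_{\mu\nu} = (3/L^2)g_{\mu\nu}$, $R = 12/L^2$, and $E_{\mu\nu} = -(3/L^2)g_{\mu\nu}$, in agreement with the closed forms in (5).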


Calculating the Riemann curvature tensor for de Sitter spacetime 


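One simple way to coordinatize de Sitter spacetime is to eliminate $W \equiv X^d$ (precisely as we
did in appendix 2 to chapter I.6) and use $X^\mu$ with $\mu = 0, 1, \cdots, d-1$ as coordinates. Start
with $W^2 = L^2 - X\cdot X$, where for convenience, I introduce the notation $A\cdot B = \eta_{\mu\nu}A^\mu B^\nu$.
Then $W\,dW = -X\cdot dX$, so that $dW^2 = (X\cdot dX)^2/(L^2 - X\cdot X)$, leading to

$$\begin{aligned}
ds^2 &= \eta_{\mu\nu}dX^\mu dX^\nu + dW^2 = \eta_{\mu\nu}dX^\mu dX^\nu - (X\cdot dX)^2/(X\cdot X - L^2) \\
&= \left(\eta_{\mu\nu} - \frac{\eta_{\mu\lambda}\eta_{\nu\rho}X^\lambda X^\rho}{X\cdot X - L^2}\right)dX^\mu dX^\nu
\end{aligned} \tag{6}$$
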

We can now calculate the Riemann curvature tensor for de Sitter spacetime. As I just
mentioned, we merely have to determine the overall coefficient in (4). Here is a rather
nifty approach. Let $X \to 0$ in (6), so that $g_{\mu\nu} \simeq \eta_{\mu\nu} + \frac{1}{L^2}\eta_{\mu\lambda}\eta_{\nu\rho}X^\lambda X^\rho$; in other words,
the metric is locally flat at $X^\mu = 0$. But in chapter VI.1, we learned how to determine
the curvature tensor in locally flat coordinates. Looking at (VI.1.11), we immediately
read off $B_{\mu\nu,\lambda\rho} = \frac{1}{2L^2}(\eta_{\mu\lambda}\eta_{\nu\rho} + \eta_{\mu\rho}\eta_{\nu\lambda})$. We then plug into (VI.1.14) to obtain $R_{\mu\nu\lambda\sigma} = \frac{1}{L^2}(\eta_{\mu\lambda}\eta_{\nu\sigma} - \eta_{\mu\sigma}\eta_{\nu\lambda})$.
As usual, we simply promote $\eta_{\mu\nu}$ to $g_{\mu\nu}$ to obtain $R_{\mu\nu\lambda\sigma}$ at an
arbitrary point. We have fixed the overall coefficient in (4). Nifty, eh? To summarize, we
eliminate $W$ and discover that the resulting coordinate system is locally flat. We look up
chapter VI.1 and fix the only feature of (4) that does not follow from general principles.


* You should realize that the index sets $(0, 1, \cdots, d-1)$ and $(0, 1, \cdots, d-1, d)$ are conceptually distinct
and are labeled by different letters. (For example, for $S^2$ we have coordinates $(\theta, \varphi)$ on the sphere and $(X, Y, Z)$
for the ambient embedding space.) But there are only so many letters and thus, a given set of letters often does
double duty.


The expanding universe once again 


We see from (5) that de Sitter spacetime is a solution of Einstein's field equation $R_{\mu\nu} = 8\pi G\Lambda g_{\mu\nu}$,
with a positive cosmological constant (see VI.5.14) given by

$$8\pi G\Lambda = \frac{3}{L^2} \tag{7}$$

To the extent that* $\sim 74\% \sim 100\%$, we can say that our universe is observed to be almost
maximally symmetric and de Sitter. (As was explained in chapter VIII.1, this approximate
statement only applies to the future, not to the past.)

Topologically, de Sitter spacetime is $R \times S^3$, with a spatial section given by $S^3$, as just
explained. In contrast, we know from chapters VI.2 and VI.5 that Einstein's field equation
with a positive cosmological constant leads to $ds^2 = -dt^2 + e^{2Ht}(dx^2 + dy^2 + dz^2)$ with the
Hubble constant given by $H = \left(\frac{8\pi G}{3}\Lambda\right)^{1/2}$, or in terms of the de Sitter length, $H = 1/L$. Thus,
an interesting question poses itself: How is the 3-dimensional flat space with coordinates
$(x, y, z)$ hidden in the (3+1)-dimensional "Minkowskian sphere" defined in (2)? It must
correspond to a rather nontrivial slice. Seems like quite a surprise that this "Minkowskian
sphere" contains an exponentially expanding universe! Indeed, can you figure it out before
reading on?


Angular coordinates on de Sitter spacetime and hyperbolic spaces 


For ease of writing and for definiteness, I now specialize to $d = 4$. Our universe might very
well be described by $dS^4$ to a good approximation, as discussed in chapter VIII.2 and in
the preceding section. (Wait! Aren't you supposed to figure out something before reading
on?) As we go along, you should, and could easily, work out $dS^d$ for any integer $d$. (When
confused, the beginner should also work out what happens for $d = 1, 2, 3$. From (2), we
see that a spatial slice of $dS^d$ at fixed $X^0$ is just the familiar sphere for $d = 3$, the circle for
$d = 2$, and two points for $d = 1$.)

For $d = 4$, it is convenient to relinquish indices and name the coordinates: $X^0 = T$,
$X^1 = X$, $X^2 = Y$, $X^3 = Z$, and $X^4 = W$, so that the defining equation (2) reads

$$-T^2 + X^2 + Y^2 + Z^2 + W^2 = L^2 \tag{8}$$


* Dark energy amounts to ~74% of the universe; see chapter VI.2. It better not be 100%, in which case there 
would be no matter to form physicists out of. 
For $X$, $Y$, and $Z$, we can go over to the usual spherical coordinates $X = r\sin\theta\cos\varphi$,
$Y = r\sin\theta\sin\varphi$, and $Z = r\cos\theta$, so the defining equation becomes

$$-T^2 + r^2 + W^2 = L^2 \tag{9}$$

(As always, $dX^2 + dY^2 + dZ^2 = dr^2 + r^2d\Omega_2^2$, with $d\Omega_2^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$.)

Before attacking $dS^4$, let's warm up with $S^3$, a somewhat less familiar sphere than $S^2$.
As explained in chapter I.6, the metric on $S^3$ with radius $L$ is induced from the Euclidean
metric $ds^2 = dX^2 + dY^2 + dZ^2 + dW^2$ of the embedding space $E^4$. We eliminate $dW$ by
differentiating the defining equation $X^2 + Y^2 + Z^2 + W^2 = L^2$, so that $-W\,dW = X\,dX + Y\,dY + Z\,dZ = r\,dr$,
using the usual spherical coordinates for $X$, $Y$, and $Z$ in the last step.
Then we have $dW^2 = \frac{(r\,dr)^2}{W^2} = \frac{r^2}{L^2 - r^2}dr^2$. Thus, we obtain $ds^2 = dr^2 + r^2d\Omega_2^2 + \frac{r^2}{L^2 - r^2}dr^2 = \frac{L^2}{L^2 - r^2}dr^2 + r^2d\Omega_2^2$
(as in I.6.11). This expression literally invites us to introduce spherical
coordinates for $S^3$ by setting $r = L\sin\psi$ (note that $\psi$ is a latitude like $\theta$ and so ranges
from 0 to $\pi$, while $\varphi$ is a longitude ranging from 0 to $2\pi$), so that

$$ds^2 = L^2\left(d\psi^2 + \sin^2\psi\,d\Omega_2^2\right) \equiv L^2d\Omega_3^2 \tag{10}$$

We just constructed $S^3$ out of $S^2$, thus rediscovering what we have known since chapter I.6:
the metric for $S^d$ can be constructed iteratively. Indeed, a bright school child would have
realized that the sphere $S^2$ can be built out of circles $S^1$.

After this exercise with $S^3$, we are now ready to induce, in exactly the same way, the
metric on the de Sitter spacetime $dS^4$ from the metric $ds^2 = -dT^2 + (dX^2 + dY^2 + dZ^2 + dW^2) = -dT^2 + (dr^2 + r^2d\Omega_2^2 + dW^2)$
of the embedding Minkowski space $M^{4,1}$.
Differentiating the defining equation (9), we have $W\,dW = T\,dT - r\,dr$, so that $dW^2 = (T\,dT - r\,dr)^2/(L^2 + T^2 - r^2)$.
Clearly, we are invited to introduce a hyperbolic angle $\psi$
by $T = t\cosh\psi$ and $r = t\sinh\psi$, so that $T^2 - r^2 = t^2$, $T\,dT - r\,dr = t\,dt$, $dT^2 - dr^2 = dt^2 - t^2d\psi^2$,
and $dW^2 = t^2dt^2/(L^2 + t^2)$. We obtain

$$ds^2 = -\frac{L^2}{L^2 + t^2}dt^2 + t^2\left(d\psi^2 + \sinh^2\psi\,d\Omega_2^2\right) = -\frac{L^2}{L^2 + t^2}dt^2 + t^2dH_3^2 \tag{11}$$

where

$$dH_3^2 = d\psi^2 + \sinh^2\psi\,d\Omega_2^2 \tag{12}$$

is the line element on the 3-dimensional hyperbolic space $H^3$ discussed in chapter I.6.
Compare and contrast (10) with (12)! The (3+1)-dimensional spacetime $dS^4$ is here
coordinatized by $(t, \psi, \theta, \varphi)$. (Clearly, the angular coordinates are just going along for
the ride, so that, for example, for $dS^3$, we have $ds^2 = -\frac{L^2}{L^2 + t^2}dt^2 + t^2(d\psi^2 + \sinh^2\psi\,d\varphi^2)$,
and for $dS^2$, the surface pictured in figure 1, $ds^2 = -\frac{L^2}{L^2 + t^2}dt^2 + t^2d\psi^2$.)


Just as clearly, we can obtain various related forms by changing variables in (11). For
example, set $t = L\tan\theta$ to obtain

$$ds^2 = \frac{L^2}{\cos^2\theta}\left(-d\theta^2 + \sin^2\theta\,dH_3^2\right) \tag{13}$$

These coordinates are defined by

$$T = L\tan\theta\cosh\psi, \qquad r = L\tan\theta\sinh\psi, \qquad W = \frac{L}{\cos\theta} \tag{14}$$

Another form,

$$ds^2 = L^2\left(-d\rho^2 + \sinh^2\rho\,dH_3^2\right) \tag{15}$$

is obtained by setting $t = L\sinh\rho$, so that

$$T = L\sinh\rho\cosh\psi, \qquad r = L\sinh\rho\sinh\psi, \qquad W = L\cosh\rho \tag{16}$$
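A direct way to gain confidence in parametrizations such as (14) and (16) is to plug them back into the defining equation (9). The sketch below does this numerically for a few sample values; the function names and the sample points are illustrative choices, not part of the text.

```python
import math

def embed_tan(L, theta, psi):
    """Coordinates (14): returns (T, r, W)."""
    return (L * math.tan(theta) * math.cosh(psi),
            L * math.tan(theta) * math.sinh(psi),
            L / math.cos(theta))

def embed_sinh(L, rho, psi):
    """Coordinates (16): returns (T, r, W)."""
    return (L * math.sinh(rho) * math.cosh(psi),
            L * math.sinh(rho) * math.sinh(psi),
            L * math.cosh(rho))

def constraint(T, r, W):
    """Left-hand side of the defining equation (9)."""
    return -T * T + r * r + W * W

L = 2.0
samples = [(0.2, 0.0), (0.7, 0.5), (1.2, 1.0)]
residuals_14 = [abs(constraint(*embed_tan(L, th, psi)) - L * L) for th, psi in samples]
residuals_16 = [abs(constraint(*embed_sinh(L, rho, psi)) - L * L) for rho, psi in samples]
print(max(residuals_14), max(residuals_16))   # both at the level of rounding error
```

Note that in both parametrizations $W \geq L$ on the sampled range, so these coordinates do not reach the whole hyperboloid.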


Note that for all these coordinate choices, the slice at constant $t$ (that is, constant $\theta$ in
(13) and constant $\rho$ in (15)) is now a hyperbolic surface (recall I.6.14) defined in (12). How
about this spacetime? Do you recognize it?

Yes, it is the open universe described in chapter V.3 for a particular cosmic expansion
factor $a(t)$.


De Sitter spacetime wears many disguises 


As we will now see, we can write de Sitter spacetime in remarkably many³ different forms,
according to various choices of coordinates. Some, such as (13) and (15), are obtained by
more or less obvious changes of variables. Others, such as the exponentially expanding 
universe, are far from obvious. For convenience, I list the many faces of de Sitter spacetime 
in the table near the end of this chapter. 

Before we start going through these different choices, we should forewarn Confusio that 
there are only so many suitable letters in the alphabet, $T$ and $t$ for time, $R$ and $r$ for the
radial coordinate. Inevitably, we are bound to use the same letter for conceptually different 
entities. We will have to trust Confusio to distinguish them by context. 

For our first alternative coordinate choice, instead of $S^2$, we can go to $S^3$, setting
$r = R\sin\psi$ and $W = R\cos\psi$ in the defining equation (9), which then becomes

$$-T^2 + R^2 = L^2 \tag{17}$$


In other words, we set $R^2 = X^2 + Y^2 + Z^2 + W^2$ in (8). Space is now described* by a series of spheres $S^3$ with radius $R \ge L$, since $R^2 = L^2 + T^2$, as shown in figure 1. Then, as per the above discussion, $ds^2 = -dT^2 + (dR^2 + R^2 d\Omega_3^2)$. We solve $R^2 = L^2 + T^2$ by writing $T = L\sinh t$, $R = L\cosh t$, so that again, as is familiar from Minkowskian geometry, we have $dT^2 - dR^2 = L^2 dt^2$. We thus obtain the metric


$$ds^2 = L^2\left(-dt^2 + \cosh^2 t\, d\Omega_3^2\right) \tag{18}$$


* The rich man would say that space is foliated by spheres with $R \ge L$.

As was just described, a fixed $t$ slice corresponds to a fixed $X^0$ slice of the hyperboloid in figure 1, giving for the spatial sections a series of spheres $S^3$ with radius varying as $\cosh t$ as $t$ ranges from $-\infty$ to $\infty$. These coordinates, defined by

$$T = L\sinh t, \quad r = L\cosh t\,\sin\psi, \quad W = L\cosh t\,\cos\psi \tag{19}$$


cover the entire hyperboloid and are thus known as global coordinates. 

Do you recognize this spacetime? Yes, it is the closed universe described in chapter V.3 
for a particular cosmic expansion factor a(t). 

You could now go on to explore some other coordinate choices. 

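As another example, start with (9) and write $ds^2 = -dT^2 + (dr^2 + r^2 d\Omega_2^2 + dW^2)$. Define $T = \rho\sinh\chi$ and $r = \rho\cosh\chi$, so that (9) becomes $\rho^2 + W^2 = L^2$ and $dT^2 - dr^2 = \rho^2 d\chi^2 - d\rho^2$. Note that $\chi$ is a time coordinate and $\rho$ a space coordinate. As $T$ ranges from $-\infty$ to $+\infty$, the time variable $\chi$ ranges from $-\infty$ to $+\infty$. Then $dW^2 = \rho^2 d\rho^2/(L^2 - \rho^2)$ (analogous to what we had above for the sphere). Putting it together, we obtain

$$\begin{aligned}
ds^2 &= -\rho^2 d\chi^2 + \left(\frac{L^2}{L^2 - \rho^2}\,d\rho^2 + \rho^2\cosh^2\chi\, d\Omega_2^2\right) \\
     &= L^2\left(-\sin^2\psi\, d\chi^2 + d\psi^2 + \sin^2\psi\cosh^2\chi\, d\Omega_2^2\right)
\end{aligned} \tag{20}$$

where we have written $\rho = L\sin\psi$, so that

$$T = L\sin\psi\sinh\chi, \quad r = L\sin\psi\cosh\chi, \quad W = L\cos\psi \tag{21}$$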


Expanding flat universe as a de Sitter spacetime 


I now finally come to the question raised earlier. Perhaps you have solved it already? 
Recall that by maximal symmetry, we know that de Sitter spacetime describes an Einstein 
universe driven by a positive cosmological constant. But the metrics we have shown thus far 
do not appear to look anything like the usual exponentially expanding Friedmann-Lemaitre 
form we first met in chapter VI.2. 

It turns out that the planar coordinates $(t, r, \theta, \varphi)$ (note: different $t$ and $r$ from before!) of the exponentially expanding universe are defined by

$$X^0 = L\left(\sinh t + \tfrac{1}{2}r^2 e^t\right), \quad X^i = L\,r\,e^t\,\omega^i, \quad X^4 = L\left(\cosh t - \tfrac{1}{2}r^2 e^t\right) \tag{22}$$

with $i = 1, 2, 3$ and $\omega^1 = \sin\theta\cos\varphi$, $\omega^2 = \sin\theta\sin\varphi$, $\omega^3 = \cos\theta$ (recall the appendix to chapter I.7 from way back). Or, in Cartesian coordinates $x^i = (x, y, z)$, write $X^i = Le^t x^i$.

So, did you figure it out? You might not have readily guessed this rather bizarre and 
seemingly totally asymmetric coordinatization. In appendix 1, we will provide a group 
theoretic interpretation. (Of course, you could have found it by brute force. See appendix 2.) 

At this stage, you are merely invited to plug and chug. First, check that the embedding 
equation (2) is satisfied. Second, show that, inserting (22) into the flat metric $ds^2 = \eta_{MN}\,dX^M dX^N$ in the embedding spacetime, the metric in our (3 + 1)-dimensional world
has, lo and behold, the nice form 




$$ds^2 = L^2\left[-dt^2 + e^{2t}\left(dr^2 + r^2 d\Omega_2^2\right)\right] = L^2\left[-dt^2 + e^{2t}\left(dx^2 + dy^2 + dz^2\right)\right] \tag{23}$$


Remarkably, this is precisely the expanding universe that we discussed in chapters VI.2 
and VIII.1, and in which we may be living. Our universe may well be a Minkowskian 
sphere endowed with a sense of time! With these coordinates, a constant t slice gives flat 
Euclidean 3-space (as already noted in chapter V.3), hence the name “planar” coordinates. 
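
The plug and chug can also be done numerically. The following is a minimal sketch, not part of the original text: it takes the planar embedding in the Cartesian form $X^0 = \sinh t + \tfrac12 r^2 e^t$, $X^i = e^t x^i$, $X^4 = \cosh t - \tfrac12 r^2 e^t$ with $L = 1$, and the helper names (`embed_planar`, `induced_metric`) are hypothetical. It checks the embedding constraint and estimates the induced metric with central differences.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0, 1.0]  # signature of the embedding spacetime M^{4,1}

def embed_planar(t, x, y, z):
    """Planar embedding with L = 1, Cartesian form X^i = e^t x^i (assumed form of (22))."""
    r2 = x * x + y * y + z * z
    X0 = math.sinh(t) + 0.5 * r2 * math.exp(t)
    X4 = math.cosh(t) - 0.5 * r2 * math.exp(t)
    return [X0, math.exp(t) * x, math.exp(t) * y, math.exp(t) * z, X4]

def constraint(point):
    # eta_{MN} X^M X^N, which should equal L^2 = 1 on the hyperboloid
    return sum(s * c * c for s, c in zip(ETA, embed_planar(*point)))

def induced_metric(point, h=1e-5):
    # Central-difference Jacobian dX^M/dx^mu, then g = J^T eta J.
    cols = []
    for mu in range(4):
        up = list(point); up[mu] += h
        dn = list(point); dn[mu] -= h
        cols.append([(a - b) / (2.0 * h)
                     for a, b in zip(embed_planar(*up), embed_planar(*dn))])
    return [[sum(ETA[M] * cols[m][M] * cols[n][M] for M in range(5))
             for n in range(4)] for m in range(4)]

point = (0.3, 0.2, -0.4, 0.1)           # an arbitrary (t, x, y, z)
g = induced_metric(point)
expected = [-1.0] + [math.exp(2.0 * point[0])] * 3   # diag(-1, e^{2t}, e^{2t}, e^{2t})
```

Within finite-difference accuracy, `g` should be diagonal with entries `expected`, which is the metric (23) with $L = 1$.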

For the familiar sphere, the embedding Cartesian coordinates (X, Y, Z) show the isome- 
tries but are not convenient to compute with, while the metric in spherical coordinates 
$ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ hides the isometries, but these coordinates are better for many purposes. Similarly, while the embedding coordinates (2) display the isometries transparently,
for various purposes other coordinates may be more convenient. For example, for cosmol- 
ogy, the planar coordinates in (23) are clearly appropriate, but they hide the underlying 
isometries. Note that we do not need to know the rather complicated transformation (22) 
at all to study cosmology. Indeed, we could have discovered, and did discover, (23) without 
talking about isometries and maximally symmetric spaces and spacetimes. It is, of course, 
illuminating to understand that the exponentially expanding universe we may be living in 
is nothing other than a glorified sphere. 


Isometries and light cone coordinates


We started out in the embedding space $M^{4,1}$ with a manifold endowed with plenty of
isometries. When we descend to a specific description tied to a particular coordinate 
choice, these isometries are still there, though harder to see. The metric in (23) provides 
a good example. It depends explicitly on time and so is definitely not invariant under 
time translation; however, it is invariant under $t \to t + \varepsilon$ accompanied by $x^i \to x^i - \varepsilon x^i$: $e^{2t}dx^2 \to e^{2\varepsilon}(1 - \varepsilon)^2 e^{2t}dx^2 \simeq e^{2t}dx^2 + O(\varepsilon^2)$. In other words, we have a Killing vector $\xi = (1, -x^i)$. This amounts to a fancy mathematical way of saying that the universe expands
exponentially! The Killing vector tells us that the passage of time can be compensated for 
by shrinking the spatial coordinates. 

Note that $g_{\mu\nu}\xi^\mu\xi^\nu = -1 + e^{2t}r^2$. The Killing vector stays timelike only in the region
$e^t r < 1$ but becomes spacelike outside. Recalling our discussion of the Schwarzschild black
hole, we recognize this switch of the Killing vector from timelike to spacelike as a hallmark 
of a horizon. 
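
As a quick check of the norm quoted above (a sketch, taking the Cartesian form of (23) with $L = 1$, so that $g_{tt} = -1$ and $g_{ij} = e^{2t}\delta_{ij}$, together with the Killing vector $\xi = (1, -x^i)$):

```latex
\begin{aligned}
g_{\mu\nu}\xi^\mu\xi^\nu
  &= g_{tt}\,(\xi^t)^2 + g_{ij}\,\xi^i\xi^j \\
  &= -1 + e^{2t}\,\delta_{ij}\,x^i x^j \\
  &= -1 + e^{2t} r^2 .
\end{aligned}
```

The norm changes sign exactly where $e^t r = 1$.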

The reader with a good memory will recall that $e^t r < 1$ describes the region enclosed by the de Sitter horizon we deduced in chapter V.3 by physical reasoning! In that chapter, sitting at $r = 0$, we sent a message at time $t$ to a friend located at $r$. The condition that she will receive our message was precisely $e^t r < 1$. For us, sitting at the origin, our de Sitter horizon is defined by

$$e^t r = 1 \tag{24}$$


Now, stare at (22). What does it suggest to you? 
Noting the form of $X^0$ and $X^4$, we realize that it would be wise to go to light cone coordinates $X^\pm = X^0 \pm X^4$. In particular, $X^+ = X^0 + X^4 = Le^t$, an equation that describes a plane labeled by $t$ in $M^{4,1}$. Thus, at a given $t$, our universe is the intersection of the Minkowskian sphere (2) with this plane. As $t$ varies from $-\infty$ to $\infty$, the plane marches upward, and the universe expands. (For instance, for $\theta = \pi/2$, the universe at the instant $t = 0$ consists of the surface $X = L(\tfrac12 r^2, r\cos\varphi, r\sin\varphi, 0, 1 - \tfrac12 r^2)$.)

So, rewrite (22) as (setting L = 1 for convenience) 


$$X^+ = e^t, \quad X^i = r\,\omega^i e^t = x^i e^t, \quad X^- = e^{-t}\left(r^2 e^{2t} - 1\right) \tag{25}$$


This light cone construction suggests, at least in hindsight, one way for the poor man 
to proceed. Suppose that the poor man does not know about the expanding universe but 
is merely possessed by an unspeakable desire to slice the hyperboloid in figure 1 with 
“lightlike” planes $X^+ = \lambda(t)$, for $\lambda$ some unknown function of some time coordinate. By rotational invariance, he writes $X^i = \sigma(t)\lambda(t)x^i$, with $\sigma$ another unknown function of $t$. Then, he uses the defining equation $-X^+X^- + \vec X^2 = 1$ to determine $X^- = \sigma^2\lambda\,\vec x^2 - \lambda^{-1}$. Plugging all this into $ds^2 = -dX^+dX^- + d\vec X^2$, he finds that he can get rid of the cross term $\vec x\cdot d\vec x\,dt$ by setting $\sigma = 1$ (and absorbing an irrelevant constant into $\vec x$). Choosing $t$ to be such that $g_{00} = -1$ fixes $\lambda(t)$ and gives the flat expanding universe in (23).
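
The poor man's computation can be spelled out as follows. This is a sketch with $L = 1$, writing $\lambda(t)$ and $\sigma(t)$ for his two unknown functions and a prime for $d/dt$; the intermediate steps are filled in here and are not quoted from the text.

```latex
\begin{aligned}
X^+ &= \lambda, \qquad X^i = \sigma\lambda\, x^i, \qquad
X^- = \sigma^2\lambda\, \vec x^{\,2} - \lambda^{-1}, \\
dX^+ &= \lambda'\,dt, \qquad
dX^i = (\sigma\lambda)'\,x^i\,dt + \sigma\lambda\,dx^i, \\
dX^- &= \left[(\sigma^2\lambda)'\,\vec x^{\,2} + \lambda'\lambda^{-2}\right]dt
        + 2\sigma^2\lambda\,\vec x\cdot d\vec x, \\[4pt]
ds^2 &= -dX^+dX^- + d\vec X^2
      = g_{00}\,dt^2 + 2\sigma\sigma'\lambda^2\,\vec x\cdot d\vec x\,dt
        + \sigma^2\lambda^2\,d\vec x^{\,2}, \\[4pt]
\sigma' = 0,\ \sigma = 1
  \;&\Rightarrow\; g_{00} = -\left(\frac{\lambda'}{\lambda}\right)^{2}, \qquad
g_{00} = -1 \;\Rightarrow\; \lambda = e^{t}, \qquad
ds^2 = -dt^2 + e^{2t}\,d\vec x^{\,2} .
\end{aligned}
```

The cross term is proportional to $\sigma'$, so it vanishes for constant $\sigma$, and the constant can be absorbed into $\vec x$. The solution $\lambda = e^{t}$ is chosen up to a shift of $t$ and the sign of $t$.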

Since in this embedding, $X^+$ is always positive, the coordinates (22) cover only part of de Sitter spacetime. In particular, referring back to (22), we see that (again setting $L$ to 1 for convenience) for $t \to \infty$, $X^4 = \cosh t - \tfrac12 r^2 e^t \to \tfrac12(1 - r^2)e^t$, while for $t \to -\infty$, $X^4 \to \tfrac12 e^{-t} \to +\infty$. Thus, as $t$ ranges from $-\infty$ to $\infty$, for $r^2 > 1$, the coordinate $X^4$ ranges from $\infty$ to $-\infty$, but for $r^2 < 1$, it only ranges between $\infty$ and $\sqrt{1 - r^2}$, reaching its minimum value when $e^{2t} = 1/(1 - r^2)$.
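
The location and value of this minimum are easy to confirm numerically. A minimal sketch, assuming $X^4 = \cosh t - \tfrac12 r^2 e^t$ with $L = 1$ (the function name `X4` and the sample values are illustrative choices, not from the text):

```python
import math

def X4(t, r):
    # X^4 = cosh t - (1/2) r^2 e^t, with L = 1
    return math.cosh(t) - 0.5 * r * r * math.exp(t)

r = 0.6                                          # any value with r^2 < 1
t_min = 0.5 * math.log(1.0 / (1.0 - r * r))      # from e^{2t} = 1/(1 - r^2)
value_at_min = X4(t_min, r)                      # should equal sqrt(1 - r^2)

# Scan t over [-8, 8] to confirm that nothing falls below the claimed minimum.
samples = [X4(-8.0 + 0.01 * k, r) for k in range(1601)]

# For r^2 > 1 the coordinate eventually becomes negative at late times.
late_value_outside = X4(10.0, 1.5)
```

For `r = 0.6` the minimum is $\sqrt{1 - 0.36} = 0.8$, while for `r = 1.5` the late-time value is negative, in line with the two cases described above.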


Poincaré half plane and temporal boundary

To make contact with observational cosmology, let $t \to t/L$, $x \to x/L$, and so forth in (23), and write $ds^2 = -dt^2 + e^{2Ht}(dx^2 + dy^2 + dz^2)$. As expected, the Hubble constant $H = 1/L$ is just the inverse of the de Sitter length.

Introducing the conformal time $u$ by $u = -e^{-Ht}/H$, we obtain another useful form:

$$ds^2 = \frac{1}{H^2u^2}\left(-du^2 + dx^2 + dy^2 + dz^2\right) \tag{26}$$


Notice that as the cosmic time $t$ runs from $-\infty$ to $+\infty$, the conformal time $u$ runs from $-\infty$ to 0. (Often it is more convenient to work with $v = -u = e^{-Ht}/H$, even though as $t$ runs from $-\infty$ to $+\infty$, the time $v$ runs backward from $\infty$ to 0.)
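
For the record, the change of variables to conformal time goes through in a few lines (a sketch, starting from $ds^2 = -dt^2 + e^{2Ht}\,d\vec x^{\,2}$ and $u = -e^{-Ht}/H$):

```latex
\begin{aligned}
u = -\frac{e^{-Ht}}{H}
  \;&\Rightarrow\; du = e^{-Ht}\,dt , \qquad e^{Ht} = -\frac{1}{Hu}, \\
dt^2 &= e^{2Ht}\,du^2 = \frac{du^2}{H^2u^2}, \qquad
e^{2Ht} = \frac{1}{H^2u^2}, \\
ds^2 &= -dt^2 + e^{2Ht}\,d\vec x^{\,2}
      = \frac{1}{H^2u^2}\left(-du^2 + d\vec x^{\,2}\right).
\end{aligned}
```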

Remembering chapter I.5, you realize that de Sitter spacetime is a Minkowskian version 
of the Poincaré half plane! Or, the Poincaré half plane is de Sitter spacetime Euclideanized. 
Just as the Poincaré half plane has a boundary, de Sitter spacetime has a temporal boundary* at $u = 0$, corresponding to $t = \infty$. At a fixed $u = 0^-$, the boundary consists of Euclidean
3-space with no time.


* This feature of de Sitter spacetime has attracted a great deal of attention.

The Poincaré coordinates, just like the closely related planar coordinates, cover only part
of the Minkowski sphere in the embedding spacetime $M^{4,1}$.


Different slices give closed, flat, and open universes 


We just saw that de Sitter spacetime describes an exponentially expanding flat universe 
driven by a positive cosmological constant. 

Earlier, I asked you whether you recognized the spacetimes described by (15) and (18). 
Well, you might have if you did exercise VIII.1.2. They describe, respectively, an open and 
a closed universe driven by a positive cosmological constant. To see this, we need to see 
through various disguises. 

For convenience, I write the two spacetimes here again (with a trivial change of notation): 


$$ds^2 = L^2\left(-dt^2 + \sinh^2 t\, dH_3^2\right) \quad \text{(open)} \tag{27}$$
and
$$ds^2 = L^2\left(-dt^2 + \cosh^2 t\, d\Omega_3^2\right) \quad \text{(closed)} \tag{28}$$


Note, once again, that space is spherical for the closed universe and hyperbolic for the 
open universe. 


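First, for the closed case, go back to the definition $L^2 d\Omega_3^2 = \frac{L^2}{L^2 - r^2}\,dr^2 + r^2 d\Omega_2^2$ in the discussion leading to (10). Putting this into (28), we obtain

$$ds^2 = -dt^2 + \left(\cosh\frac{t}{L}\right)^2\left(\frac{1}{1 - \frac{r^2}{L^2}}\,dr^2 + r^2 d\Omega_2^2\right) \quad \text{(closed)} \tag{29}$$

after scaling $t \to t/L$.

For the open case, go back to (12) and set $\sinh\psi = r$ (note to Confusio: not the same $r$ as in (16)) and so $d\psi^2 = dr^2/(1 + r^2)$. We then obtain (again after suitable scaling)

$$ds^2 = -dt^2 + \left(\sinh\frac{t}{L}\right)^2\left(\frac{1}{1 + \frac{r^2}{L^2}}\,dr^2 + r^2 d\Omega_2^2\right) \quad \text{(open)} \tag{30}$$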

Since I am writing a textbook, I felt obliged to drag the de Sitter length around, at least 
for a while, but now I am finally fed up. Henceforth, let us use L as the length unit and 
set L = 1. 

It is instructive to make contact with the cosmological equation determining the scale 
factor $a(t)$ (defined in chapter VIII.1, as you may recall). The functions $a(t) = \cosh t$, $a(t) = e^t$, and $a(t) = \sinh t$, in (29), (23), and (30), respectively, satisfy the elementary identities $\sinh^2 t + 1 = \cosh^2 t$, $e^{2t} = e^{2t}$, and $\cosh^2 t - 1 = \sinh^2 t$, respectively. But these represent just the three versions of the cosmological equation (written in appropriate units) for universes driven by a cosmological constant, namely $\dot a^2 + k = a^2$ for $k = +1$, $0$, and $-1$,
respectively. The universe boiled down to elementary identities! 
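
A small numerical sanity check of these three cases, as a sketch only: it assumes the cosmological equation in the form $\dot a^2 + k = a^2$ (with $L = 1$), and the table `CASES` and function `residual` are illustrative names rather than anything from the text.

```python
import math

# name: (scale factor a, its derivative da/dt, spatial curvature k)
CASES = {
    "closed": (math.cosh, math.sinh, +1),
    "flat": (math.exp, math.exp, 0),
    "open": (math.sinh, math.cosh, -1),
}

def residual(name, t):
    """Left side minus right side of  (da/dt)^2 + k = a^2  at time t."""
    a, adot, k = CASES[name]
    return adot(t) ** 2 + k - a(t) ** 2

max_residual = max(abs(residual(name, t))
                   for name in CASES
                   for t in (-1.5, -0.3, 0.0, 0.7, 2.0))
```

`max_residual` should vanish up to floating-point rounding, which is just the statement that the three scale factors solve the same equation with different $k$.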

As was explained in chapter VIII.1, a universe consisting purely of a cosmological 
constant could evade the argument given there that a Big Bang was inevitable in the past. For $k = +1$, the equation just cited indicates that $a$ cannot drop below 1. For $k = -1$, there was a Big Bang: the right hand side becomes negligible for small $t$ and $a$ grows linearly from the Bang. Interestingly, the closed, flat, and open universes driven by a positive cosmological constant correspond to different coordinatizations of de Sitter spacetime: different spatial curvature but the same $\Lambda$. As was also emphasized in chapter VIII.1, these are mathematical, rather than physical, universes, as any amount of radiation or matter would dominate over $\Lambda$ in the past near the Big Bang.

Let us now track down the coordinate transformation that led to (15) and (18) and 
compare it with the corresponding transformation (22) in the flat case (which I rewrite 
here for convenience): 


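$$\begin{aligned}
X^0 &= \sinh t + \tfrac12 r^2 e^t, \qquad X^4 = \cosh t - \tfrac12 r^2 e^t \\
X^i &= r\,e^t\,\omega^i, \quad i = 1, 2, 3 \qquad \text{(flat)}
\end{aligned} \tag{31}$$

For the closed cosmological constant-dominated universe, we have (see (19))

$$\begin{aligned}
X^0 &= \sinh t, \qquad X^4 = \sqrt{1 - r^2}\,\cosh t \\
X^i &= r\cosh t\;\omega^i, \quad i = 1, 2, 3 \qquad \text{(closed)}
\end{aligned} \tag{32}$$

For the open cosmological constant-dominated universe, we have (see (16), with a trivial change of notation)

$$\begin{aligned}
X^0 &= \sqrt{1 + r^2}\,\sinh t, \qquad X^4 = \cosh t \\
X^i &= r\sinh t\;\omega^i, \quad i = 1, 2, 3 \qquad \text{(open)}
\end{aligned} \tag{33}$$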

You might ask how the 3-dimensional space we live in, at a given instant in $t$, is
embedded in the hyperboloid I refer to as the Minkowskian sphere. To ease visualization,
you may wish to specialize to $dS^2$.

For the flat case, we already mentioned that space consists of the intersection of lightlike 
planes with the hyperboloid. 

For the closed case, a given instant in $t$ corresponds to a slice of the hyperboloid at a fixed $X^0$. Space is just a circle around the hyperboloid. In particular, for $t = 0$, $X = (0, \sqrt{1 - r^2}, r)$, understandably just the circle around the “waist” of the hyperboloid.

For the open case, a given instant in $t$ corresponds to a slice of the hyperboloid at a fixed $X^4$. Space at $t = 0$ degenerates into the point $X = (0, 0, 1)$. It’s the Big Bang!


Static coordinates

For another interesting coordinate system, solve the defining equation (9) $-T^2 + r^2 + W^2 = 1$ by writing

$$T = \sqrt{1 - r^2}\,\sinh t, \quad W = \sqrt{1 - r^2}\,\cosh t \tag{34}$$

and thus obtain

$$ds^2 = -(1 - r^2)\,dt^2 + \frac{dr^2}{1 - r^2} + r^2 d\Omega_2^2 \tag{35}$$

Remarkably, the metric has the same form as the Schwarzschild metric and furthermore is static. The time dependence has disappeared. In other words, $t \to t + \text{constant}$ is an isometry, or in somewhat fancier language, $\frac{\partial}{\partial t}$ is a Killing vector. Note that this Killing vector is timelike only for $r < 1$, that is, inside the de Sitter length. We see that $r = 1$ defines the de Sitter horizon. Indeed, the embedding (34) holds only for $r < 1$. Note also that $T + W = \sqrt{1 - r^2}\,e^{t} > 0$ and $-T + W = \sqrt{1 - r^2}\,e^{-t} > 0$, so that these coordinates cover only one quarter of the spacetime, namely the region $W > |T|$. Another quarter is covered by flipping the sign of $W$ in (34), leading to the same metric.
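
The static metric follows from the embedding in a few lines. This is a sketch with $L = 1$, taking $T = \sqrt{1 - r^2}\sinh t$ and $W = \sqrt{1 - r^2}\cosh t$ and introducing the shorthand $f(r)$, which is not notation from the text:

```latex
\begin{aligned}
f(r) &\equiv \sqrt{1-r^2}, \qquad f'(r) = -\frac{r}{f}, \\
dT &= f'\sinh t\,dr + f\cosh t\,dt, \qquad
dW = f'\cosh t\,dr + f\sinh t\,dt, \\
-dT^2 + dW^2 &= -f^2\,dt^2 + f'^2\,dr^2
              = -(1-r^2)\,dt^2 + \frac{r^2}{1-r^2}\,dr^2, \\
ds^2 &= -dT^2 + dW^2 + dr^2 + r^2 d\Omega_2^2
      = -(1-r^2)\,dt^2 + \frac{dr^2}{1-r^2} + r^2 d\Omega_2^2 .
\end{aligned}
```

The cross terms in $dt\,dr$ cancel between $dT^2$ and $dW^2$, which is why no off-diagonal component survives.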

As in the case of the Schwarzschild metric, as we formally cross the horizon into the 
region r > 1, t becomes a spatial coordinate and r a temporal coordinate. 

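Interestingly, we can put a spherical mass or a black hole in de Sitter spacetime. Indeed, recall from exercise VI.3.6 that

$$ds^2 = -\left(1 - \frac{2M}{r} - \frac{r^2}{L^2}\right)dt^2 + \frac{dr^2}{1 - \frac{2M}{r} - \frac{r^2}{L^2}} + r^2 d\Omega_2^2 \tag{36}$$

satisfies the Einstein field equation $R_{\mu\nu} = \frac{3}{L^2}\,g_{\mu\nu}$ outside the black hole, but not $R_{\mu\nu\lambda\sigma} = \frac{1}{L^2}\left(g_{\mu\lambda}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\lambda}\right)$, of course, since this spacetime, known as the Schwarzschild-de Sitter spacetime ($SdS^4$ for short), is not maximally symmetric.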


Kruskal-Szekeres—like coordinates for de Sitter spacetime 


Starting with the de Sitter spacetime (35), we can go through steps similar to those we took in chapter VII.2 to obtain the Kruskal-Szekeres coordinates for the Schwarzschild black hole. Introduce $x^\pm = t \pm \tfrac12\log\left(\frac{1 + r}{1 - r}\right)$, for $0 < r < 1$. We have $dx^\pm = dt \pm \frac{dr}{1 - r^2}$, and hence

$$ds^2 = -(1 - r^2)\,dx^+dx^- + r^2 d\Omega^2 \tag{37}$$

where $r$ is understood to be $r(x^+, x^-) = \tanh\left(\frac{x^+ - x^-}{2}\right)$. Also, $2t = x^+ + x^-$.

Next, introduce $U = e^{x^-}$ and $V = -e^{-x^+}$, so that $UV = -e^{-(x^+ - x^-)}$ and hence

$$r = \frac{1 + UV}{1 - UV} \tag{38}$$

Also, we have

$$e^{2t} = -\frac{U}{V} \tag{39}$$

Plugging $dU = U\,dx^-$, $dV = -V\,dx^+$, and (38) into (37), we obtain

$$ds^2 = \frac{1}{(1 - UV)^2}\left(-4\,dU\,dV + (1 + UV)^2\,d\Omega^2\right) \tag{40}$$
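
The algebra behind the last step is short enough to display. A sketch, assuming the definitions $U = e^{x^-}$, $V = -e^{-x^+}$ and the relation (38) in the form $r = (1 + UV)/(1 - UV)$:

```latex
\begin{aligned}
dx^- &= \frac{dU}{U}, \qquad dx^+ = -\frac{dV}{V}
  \;\Rightarrow\; dx^+dx^- = -\frac{dU\,dV}{UV}, \\
1 - r^2 &= \frac{(1-UV)^2 - (1+UV)^2}{(1-UV)^2} = -\frac{4UV}{(1-UV)^2}, \\
-(1-r^2)\,dx^+dx^- &= -\frac{4\,dU\,dV}{(1-UV)^2}, \qquad
r^2\, d\Omega^2 = \frac{(1+UV)^2}{(1-UV)^2}\,d\Omega^2 .
\end{aligned}
```

Adding the last two expressions gives the metric in the $(U, V)$ coordinates, which depends only on the product $UV$.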


Let us focus on the spacetime described by (40). In other words, now that we are done with $x^\pm$, we forget about them. In figure 2, we show the salient features of this spacetime. The $U$ and $V$ axes are drawn at 45° from the vertical and divide spacetime into four regions labeled I, II, III, and IV, as in the Schwarzschild case (discussed in chapter VII.2). According to (38), lines of constant $r$ correspond to hyperbolas in the ($U$-$V$) plane. In particular, the north and south poles, both with $r = 0$, correspond to the hyperbolas $UV = -1$ located in region IV and region I, respectively. The de Sitter horizon at $r = 1$ corresponds to $UV = 0$, namely the $U$- and $V$-axes. Finally, spatial infinity $r = \infty$ corresponds to $UV = 1$.

Figure 2 Kruskal-Szekeres-like coordinates for de Sitter spacetime. (a) The de Sitter horizon at $r = 1$ corresponds to $UV = 0$, namely the $U$ and $V$ axes, drawn at 45° from the vertical. Spatial infinity $r = \infty$ corresponds to $UV = 1$. (b) The lines of constant $t$ are shown. In region I, time points upward, that is, the Killing vector is future directed, but in region IV, the Killing vector is past directed. (c) The Penrose diagram for de Sitter spacetime.

Notice that the metric in these (U, V) coordinates depends only on the product UV and 
not on U and V separately. Referring back to (35) and (39), we see that this is of course why 
the $(t, r)$ coordinates are called static coordinates in the first place: the metric in (35) does not depend on $t$. Speaking in a fancier tongue, we would say that the spacetime admits a Killing vector

$$\frac{\partial}{\partial t} = U\frac{\partial}{\partial U} - V\frac{\partial}{\partial V} \tag{41}$$


Explicitly, scaling $U \to e^{\varepsilon}U$ and $V \to e^{-\varepsilon}V$ in (39) oppositely generates time translation $t \to t + \varepsilon$, leaving $r$ unchanged. In figure 2b, we show the lines of constant $t$, namely $U = -e^{2t}V$. In region I, with $U > 0$ and $V < 0$, we see that with increasing $t$, the lines tilt upward, eventually ending up on the $U$-axis (namely the line $V = 0$). In other words, in region I, time points upward, or more accurately, the Killing vector is future directed.

Note also from (39) that as $t \to \infty$, $V \to 0$. So indeed, the $U$-axis corresponds to $t = \infty$, in agreement with what we just said. Similarly, the $V$-axis corresponds to $t = -\infty$.

Now we see an interesting phenomenon: the lines of constant $t$ continue into region
IV, with $U < 0$ and $V > 0$. As $t$ increases, the lines tilt downward. In region IV, the Killing
vector is past directed. Of course, this just means that —t corresponds to time, and an 
observer in region IV would still move upward with the passage of proper time. 

In more mundane language, the Killing vector (41) is given in component form by $\xi^\mu = (U, -V, 0, 0)$. With $g_{UV} = -2/(1 - UV)^2$ read off from (40), we find that $\xi^2 = g_{\mu\nu}\xi^\mu\xi^\nu = 2g_{UV}\xi^U\xi^V = 4UV/(1 - UV)^2$, which is just $-(1 - r^2)$. As expected, in regions I and IV, $\xi^2 < 0$ and $\xi$ is timelike. But in regions II (with $U > 0$, $V > 0$) and III (with $U < 0$, $V < 0$), $\xi$ is spacelike. Actually, we already knew that the time coordinate $t$
and the space coordinate r exchange roles as we cross the horizon, which in this diagram 
corresponds to the U- and V-axes. 

Finally, we can compactify and knead figure 2a into a square, that is, a Penrose diagram, 
as shown in figure 2c. 


Thermal radiation from the de Sitter horizon 


In 1976, Gibbons and Hawking showed that thermal radiation emanates from the de Sit- 
ter horizon, similar to the radiation emanating from the Schwarzschild horizon and to 
the radiation seen by an accelerated observer, discussed in chapter VII.3. The physics 
underlying each of these three cases is quite similar: quantum fluctuation produces a 
particle-antiparticle pair near the horizon (the Schwarzschild horizon, the Rindler hori- 
zon, and the de Sitter horizon, as the case may be), with one of them disappearing over 
the horizon, never to be seen by the observer. The other member of the pair is observed 
as thermal radiation. Indeed, we have already mentioned the striking similarity in form 
between de Sitter spacetime in static coordinates in (35) and the Schwarzschild metric, 
with the coordinate t changing from a temporal coordinate to a spatial coordinate. 

The temperature of the Gibbons-Hawking radiation can again be estimated by dimensional analysis:

$$T_{\text{de Sitter}} \sim \frac{1}{L} \to \frac{\hbar c}{L} \tag{42}$$

A detailed quantum field theoretic analysis, well beyond the scope of this book, is needed
only to determine the overall numerical coefficient (which happens to be $(2\pi)^{-1}$).

A great deal of mystery lurks behind the Gibbons-Hawking radiation, however, even 


beyond the mysteries behind Hawking radiation. To start with, the de Sitter horizon is
observer dependent. For the black hole, we could invoke the possible microstates in its
formation to account for the entropy. It is far from evident what the corresponding
counting of microstates would be for de Sitter spacetime.


Causal structure of de Sitter spacetime 


Faced with this almost bewildering variety of coordinates, we evidently should choose 
wisely, using coordinates appropriate for the physics at hand. For visualizing and calculat- 
ing geodesics, the embedding coordinates in figure 1 may actually be best (see appendix 
3). But to understand the causal structure of de Sitter spacetime, clearly some sort of conformal coordinates (such as (26)) are best. Here we derive, by inserting $\cosh t = \frac{1}{\cos\tau}$ into (18), the metric in conformal coordinates (recall (10)):

$$ds^2 = \frac{1}{\cos^2\tau}\left(-d\tau^2 + d\Omega_3^2\right) = \frac{1}{\cos^2\tau}\left(-d\tau^2 + d\psi^2 + \sin^2\psi\, d\Omega_2^2\right) \tag{43}$$

Here $\tau$ ranges from $-\frac{\pi}{2}$ to $\frac{\pi}{2}$, causing $t$ to range from $-\infty$ to $\infty$, while the latitude $\psi$, as remarked earlier in connection with (10), ranges from 0 to $\pi$. Note that space, namely a constant $\tau$ slice of spacetime, is not flat. In terms of the coordinates of the embedding spacetime $M^{4,1}$, we have

$$T = \tan\tau, \quad r = \frac{\sin\psi}{\cos\tau}, \quad W = \frac{\cos\psi}{\cos\tau} \tag{44}$$
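
The substitution can be checked in two lines (a sketch with $L = 1$, starting from the global metric $ds^2 = -dt^2 + \cosh^2 t\, d\Omega_3^2$ and $\cosh t = 1/\cos\tau$):

```latex
\begin{aligned}
\cosh t = \frac{1}{\cos\tau}
  \;&\Rightarrow\; \sinh t = \tan\tau, \qquad
  \sinh t\,dt = \frac{\sin\tau}{\cos^2\tau}\,d\tau
  \;\Rightarrow\; dt = \frac{d\tau}{\cos\tau}, \\
ds^2 &= -dt^2 + \cosh^2 t\, d\Omega_3^2
      = \frac{1}{\cos^2\tau}\left(-d\tau^2 + d\Omega_3^2\right).
\end{aligned}
```

Since $\cosh t \to \infty$ as $\tau \to \pm\pi/2$, the infinite range of $t$ is squeezed into a finite interval of $\tau$, which is what makes the Penrose diagram possible.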


Consider the Penrose diagram in figure 3a. As depicted, the $\tau$ axis is vertical, the $\psi$ axis horizontal. Each point in this 2-dimensional $(\tau, \psi)$ plot on a piece of paper represents a 2-sphere. The left and right hand sides correspond to the north ($\psi = 0$) and south ($\psi = \pi$) poles, respectively. Note that $r = 0$ for both the north and south poles, with $W$ taking on opposite signs. The surfaces labeled $\mathcal{I}^-$ and $\mathcal{I}^+$ sit in the infinite past and future, respectively.

The attractive feature of a Penrose diagram is of course that light rays travel along lines 


at 45°. Indeed, in the $d = 2$ case, the trajectory a photon takes is given simply by $d\theta = \pm d\tau$.

Imagine yourself sitting at the south pole (which is of course equivalent to any other 
point on the sphere). Where in spacetime can you send a message to? 

Clearly, you can send a message to any spacetime point in the shaded region labeled as
$O^+$ (figure 3b). For example, you can send a light pulse to B from the point A on your
worldline, as shown in figure 3b. Of course, before you reach A, you can also send a
message traveling at less than the speed of light to B, but once you live past A, you can no
longer send anything to B.

Can the other observer, upon receipt of your message at B, send you a response? She
cannot. Any response she sends will end up at $\mathcal{I}^+$.

Finally, you cannot send a message to the spacetime point C even if you had thought 
of it in your infinite past (namely, the lower right corner). A light pulse sent from the 
south pole in the infinite past would reach the north pole in the infinite future. However, 
if an observer stationed at C hurries, she could intercept your message upon crossing the 
diagonal line.

Figure 3 Penrose diagrams showing the causal structure of de Sitter spacetime. (a) Each point in this 2-dimensional $(\tau, \psi)$ plot represents a 2-sphere. The left and right hand sides correspond to the north ($\psi = 0$) and south ($\psi = \pi$) poles, respectively. (b) Suppose you are sitting at the south pole (which is in fact anywhere). You can send a light pulse to B from the point A on your worldline, but once you live past A, you can no longer send anything to B. An observer stationed at C can intercept your message if she hurries across the diagonal line into $O^+$. (c) From any point in $O^-$, a message can be sent to you, but not from outside $O^-$. (d) The region in spacetime you can communicate with is known as the (southern) causal diamond, shown as the shaded region. You can send a message to D and actually get a response back.


Similarly, we could ask “From where in spacetime can a message be sent to you?” That
region is shaded and labeled as $O^-$ in figure 3c. For example, a message sent from the
point C will end up in $\mathcal{I}^+$.

As many self-help books have assured us, communication is a two-way street. By this
definition, the region in spacetime you can communicate with is given by the intersection
of $O^+$ and $O^-$, known to the cognoscenti as the (southern) causal diamond and shown as
the shaded region in figure 3d (and referred to as region I in figure 2). For example, you
can send a message to D and actually get a response back.

In fact, we have already seen this causal structure of de Sitter spacetime by an explicit 
calculation in chapter V.3. In an exponentially expanding universe, everybody is moving
away from us and will eventually move past our horizon forever, just as distant ships move
over our everyday horizon. For example, unless the observer at D fires up her rocketship
and hurries, she will eventually cross the 45° line in figure 3d and leave the southern causal
diamond, headed toward the infinite future $\mathcal{I}^+$, as indicated by the solid curved line. If
she really hurries, she could still meet you, but not until after you live past the point A′. If
all goes well, she could then merge her worldline with yours, and you could travel to $\mathcal{I}^+$
together.


Iterative relationship between de Sitter spacetimes 


All the way back in chapter I.5 you showed in exercise I.5.10 that the metrics on the spheres $S^d$ enjoy an iterative relation between them: $ds_d^2 = d\theta^2 + \sin^2\theta\, ds_{d-1}^2$. (See also (10).) Not surprisingly, the metrics on de Sitter spacetimes $dS^d$ also enjoy an iterative relation between them, since they are Minkowskian spheres.

Recall the angular coordinates you worked out (exercise I.5.9) on the sphere $S^d$:

$$\begin{aligned}
X^1 &= \cos\theta_1, \qquad X^2 = \sin\theta_1\cos\theta_2, \qquad \ldots, \\
X^d &= \sin\theta_1\cdots\sin\theta_{d-1}\cos\theta_d, \qquad X^{d+1} = \sin\theta_1\cdots\sin\theta_{d-1}\sin\theta_d
\end{aligned} \tag{45}$$


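As already alluded to earlier in this chapter, and as explained in chapter III.3, we can jump immediately from the sphere to de Sitter spacetime by making Minkowski’s “mystical” substitution $X^{d+1} = iX^0$, so that (1) becomes (2). Calling $\theta_d$ for brevity $\theta$, we write* $\theta = i\varphi$ formally (so that $\cos\theta = \cosh\varphi$ and $\sin\theta = i\sinh\varphi$) and obtain

$$\begin{aligned}
X^1 &= \cos\theta_1, \qquad X^2 = \sin\theta_1\cos\theta_2, \qquad \ldots, \\
X^d &= \sin\theta_1\cdots\sin\theta_{d-1}\cosh\varphi, \qquad X^0 = \sin\theta_1\cdots\sin\theta_{d-1}\sinh\varphi
\end{aligned} \tag{46}$$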


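Replacing $d\theta_d^2$ in the spherical metric

$$ds_d^2 = d\theta_1^2 + \sin^2\theta_1\, d\theta_2^2 + \cdots + \sin^2\theta_1\cdots\sin^2\theta_{d-1}\, d\theta_d^2$$

by $-d\varphi^2$, we obtain, for the de Sitter metric,

$$ds_d^2 = d\theta_1^2 + \sin^2\theta_1\, d\theta_2^2 + \cdots - \sin^2\theta_1\cdots\sin^2\theta_{d-1}\, d\varphi^2$$

Thus, we arrive at the iterative relationship for $dS^d$:

$$ds_d^2 = d\theta^2 + \sin^2\theta\, ds_{d-1}^2 \tag{47}$$

which is formally precisely the same as the iterative relationship for $S^d$.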

As remarked earlier in chapter I.5, this iterative relation just expresses the fact, known to 
many school children, that lines of constant latitude on the globe form circles. Similarly, 
you can see from figure 1 that if you slice the hyperboloid representing $dS^d$, you get a
hyperboloid of one lower dimension. 


* More precisely, we analytically continue. In quantum field theory, this is known as a Wick rotation.

I might mention here that, while we went, at the start of this chapter, from a sphere to a de Sitter spacetime by a feat of imaginary magic, we can certainly go from a de Sitter spacetime back to a sphere using the same trick. Letting $X^0 \to iX^0$ while keeping $X^i$, $X^4$ fixed, we can turn a set of coordinates on $dS^4$ into a set of coordinates on $S^4$. For example, if we let $t \to it$ (so that $\sinh t \to i\sin t$ and $\cosh t \to \cos t$), then the coordinates in (32) for the closed cosmological constant-dominated universe become coordinates for $S^4$. In doing this, we have to make sure that $X^i$ and $X^4$ remain the same, of course. Thus, for example, for the coordinates in (33) for the open cosmological constant-dominated universe, we have to let $r \to ir$ as well as $t \to it$. As another example, letting $t \to it$ and $\psi \to i\psi$ in the metric $ds^2 = -\frac{L^2}{L^2 + t^2}dt^2 + t^2(d\psi^2 + \sinh^2\psi\, d\theta^2)$ for $dS^3$ mentioned after (11), we recover the metric $ds^2 = \frac{L^2}{L^2 - r^2}dr^2 + r^2 d\Omega_2^2$ for $S^3$ mentioned just before (10).


Stereographic projection for de Sitter spacetime 


Just as we can stereographically project the sphere (surely you did exercise I.5.13), we can stereographically project de Sitter spacetime by mapping $(X^0, X^1, X^2, X^3, X^4)$ into $(x^0, x^1, x^2, x^3)$ as follows (here we reinstate $L$):

$$X^M = \frac{1}{1 + \frac{x^2}{4L^2}}\,\delta^M_\mu\, x^\mu, \quad M = 0, 1, 2, 3 \tag{48}$$

and

$$X^4 = L\,\frac{1 - \frac{x^2}{4L^2}}{1 + \frac{x^2}{4L^2}} \tag{49}$$

where $x^2 = -(x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2$. The Kronecker delta in (48) emphasizes that while the indices $M$ and $\mu$ might be numerically identical, they are not conceptually the same. In other words, (48) says that, for example, the components $X^2$ and $x^2$ differ by the overall factor $\left(1 + \frac{x^2}{4L^2}\right)^{-1}$.

You can now analytically continue the result from chapter I.5 on stereographic projection of the sphere to verify that the defining relation (2) is satisfied and that

$$ds^2 = \frac{1}{\left(1 + \frac{x^2}{4L^2}\right)^2}\,\eta_{\mu\nu}\,dx^\mu dx^\nu \tag{50}$$

(Of course, you can also check this by brute force, plugging (48) and (49) into $ds^2 = \eta_{MN}\,dX^M dX^N$.) This shows explicitly that de Sitter spacetime, aka the Minkowskian sphere, is conformally flat in any dimension. Upon recalling that the garden variety sphere is also conformally flat,* we are perhaps not surprised.
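
The brute-force check can be delegated to a few lines of code. The following is a minimal sketch, not from the text: it assumes the projection in the form $X^\mu = x^\mu/(1 + x^2/4L^2)$, $X^4 = L\,(1 - x^2/4L^2)/(1 + x^2/4L^2)$, and the names `embed_stereo`, `induced_metric`, and the test point are illustrative choices.

```python
L = 2.0                               # de Sitter length (arbitrary test value)
ETA4 = [-1.0, 1.0, 1.0, 1.0]          # eta_{mu nu} for the projected coordinates
ETA5 = ETA4 + [1.0]                   # eta_{MN} for the embedding spacetime

def minkowski_square(x):
    # x^2 = eta_{mu nu} x^mu x^nu
    return sum(s * c * c for s, c in zip(ETA4, x))

def embed_stereo(x):
    """Map x^mu to (X^0, X^1, X^2, X^3, X^4) using the assumed projection formulas."""
    u = minkowski_square(x) / (4.0 * L * L)
    return [c / (1.0 + u) for c in x] + [L * (1.0 - u) / (1.0 + u)]

def induced_metric(x, h=1e-5):
    # Central-difference Jacobian dX^M/dx^mu, then g = J^T eta J.
    cols = []
    for mu in range(4):
        up = list(x); up[mu] += h
        dn = list(x); dn[mu] -= h
        cols.append([(a - b) / (2.0 * h)
                     for a, b in zip(embed_stereo(up), embed_stereo(dn))])
    return [[sum(ETA5[M] * cols[m][M] * cols[n][M] for M in range(5))
             for n in range(4)] for m in range(4)]

x = [0.3, 0.5, -0.2, 0.4]
X = embed_stereo(x)
constraint = sum(s * c * c for s, c in zip(ETA5, X))        # should equal L^2
omega2 = 1.0 / (1.0 + minkowski_square(x) / (4.0 * L * L)) ** 2
g = induced_metric(x)                                       # should equal omega2 * eta
```

If the formulas are as assumed, `constraint` equals $L^2$ and `g` equals `omega2` times $\eta_{\mu\nu}$ within finite-difference accuracy, which is the conformal flatness just stated.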


The rise of de Sitter spacetime 


De Sitter and anti de Sitter spacetimes have become the darlings of theoretical physicists 
for entirely different reasons. We will talk about anti de Sitter spacetime in the following 
chapter. As already mentioned on several occasions, observational cosmologists tell us that 
our universe is partly filled by a dark energy, presumably the cosmological constant. Thus, 
as we discussed in chapter VIII.2, our universe will eventually expand into a de Sitter 
spacetime. To me, it is strangely appealing that, having discovered that our world is 
Euclidean round, we now realize that our universe will become Minkowskian round. 
Allow me to go into a bit more detail about the history of the de Sitter metric, first 
mentioned in chapter V.3 (see table for de Sitter spacetime). In 1917 de Sitter found that

$$ds^2 = -\cos^2\chi\, dt^2 + \left(d\chi^2 + \sin^2\chi\, d\Omega_2^2\right) \tag{51}$$

satisfies (4) and hence solves Einstein’s field equation with a positive cosmological constant. The attentive reader recognizes that this metric is just the static metric (35) with the simple transformation $r = \sin\chi$. In 1922, Lanczos and Weyl independently and correctly wrote down $ds^2 = -dt^2 + \cosh^2 t\,(d\varphi^2 + \cos^2\varphi\, d\psi^2 + \cos^2\varphi\cos^2\psi\, d\chi^2)$, which again the attentive reader recognizes as the global metric (18). Then, in 1925, Lemaitre, while still a student, noted that de Sitter’s coordinates were not comoving, namely that lines of constant $\chi$, $\theta$, and $\varphi$ were not geodesics unless $\chi = 0$ (hence the corresponding “defect” plagues (35) also), and so de Sitter’s coordinate choice was not homogeneous. Lemaitre then discovered
the metric (23) for the flat expanding universe that we have known and loved since chapter 
V.3. Later, in 1927, he extended his work to include closed and open universes. Perhaps 
with some justification, we should refer to the spacetime studied in this chapter as the 
de Sitter-Lanczos-Weyl-Lemaitre spacetime. 


Appendix 1: The group theory behind the exponentially expanding universe 


As promised, I now give a more satisfying motivation⁹ for the coordinate transformation given in (22). For this
discussion, we will be a bit more general and discuss $dS^d = SO(d,1)/SO(d-1,1)$. With L = 1, (2) reads

$$\eta_{MN}X^M X^N = -(X^0)^2 + \sum_{i=1}^{d-1}(X^i)^2 + (X^d)^2 = 1 \qquad (52)$$


As always, we have the Lie algebra of $SO(d,1)$ (see (III.3.21); M, N, P, and Q range over $0, 1, 2, \cdots, d$):

$$[J_{MN}, J_{PQ}] = i\left(\eta_{MP}J_{NQ} + \eta_{NQ}J_{MP} - \eta_{NP}J_{MQ} - \eta_{MQ}J_{NP}\right) \qquad (53)$$


* At this stage, I need hardly remind you that conformally flat is not flat!

We now identify the $\frac{1}{2}d(d+1)$ generators of $SO(d,1)$ acting on $dS^d$. In addition to the generators of rotation
$J_{ij}$ (here i and j range over $1, 2, \cdots, d-1$), we have the combinations

$$P_i = J_{i0} + J_{d,i}, \qquad D = J_{d,0}, \qquad K_i = J_{i0} - J_{d,i} \qquad (54)$$


which we identify as the generators of translation, dilation, and conformal transformation, respectively, as
discussed in the preceding chapter. As the context is slightly different (and because of our inclusion of factors of
i here), we again display the algebra, which you can deduce from (53):

$$\begin{aligned}
&[P_i, P_j] = 0, \qquad [K_i, K_j] = 0, \\
&[D, P_i] = -iP_i, \qquad [D, J_{ij}] = 0, \qquad [D, K_i] = iK_i, \\
&[J_{ij}, P_k] = i(\delta_{ik}P_j - \delta_{jk}P_i), \qquad [J_{ij}, K_k] = i(\delta_{ik}K_j - \delta_{jk}K_i), \\
&[P_i, K_j] = -2i\delta_{ij}D - 2iJ_{ij}
\end{aligned} \qquad (55)$$

For example, $[P_i, P_j] = [J_{i0} + J_{d,i},\, J_{j0} + J_{d,j}] = i(-J_{ij} + J_{ij} - \delta_{ij}D + \delta_{ij}D) = 0$. Note that to obtain this familiar
result, we have to define translation $P_i$ as a linear combination of a boost in the ith direction and a rotation in
the (d-i) plane. As another example, $[D, P_i] = [J_{d,0},\, J_{i0} + J_{d,i}] = -i(J_{d,i} - J_{0i}) = -iP_i$.

These generators act linearly on the embedding coordinates $X^M$. As in chapter III.3, their action can be
represented by $J_{MN} = i(X_M\partial_N - X_N\partial_M)$. Thus, each of these generators is represented by a (d + 1)-by-(d + 1)
matrix. We arrange the indices in the “natural” order $(0, \{i\}, d) = (0, 1, 2, \cdots, d-1, d)$ where, as indicated
above, i ranges over $1, 2, \cdots, d-1$. For example, the boost $D = J_{d,0}$ in the dth direction is

$$D = i\left(\begin{array}{c|c|c} 0 & 0 & -1 \\ \hline 0 & 0 & 0 \\ \hline -1 & 0 & 0 \end{array}\right) \qquad (56)$$


The notation is such that along the diagonal, in the upper left, the 0 represents a 1-by-1 matrix with its entry equal 
to 0; in the center, the 0 represents a (d — 1)-by-(d — 1) matrix with all its entries equal to 0; and finally, in the 
lower right, the 0 once again represents a 1-by-1 matrix with its entry equal to 0. Exponentiating the generator D 
to obtain the group element, we obtain 


$$e^{iDt} = \left(\begin{array}{c|c|c} \cosh t & 0 & \sinh t \\ \hline 0 & I & 0 \\ \hline \sinh t & 0 & \cosh t \end{array}\right) \qquad (57)$$


In other words, this is a (d + 1)-by-(d + 1) matrix with a (d — 1)-by-(d — 1) identity matrix in its center. 
Similarly, we have 


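$$\vec P\cdot\vec x = -i\left(\begin{array}{c|c|c} 0 & \vec x^{\,T} & 0 \\ \hline \vec x & 0 & \vec x \\ \hline 0 & -\vec x^{\,T} & 0 \end{array}\right) \qquad (58)$$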


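In the matrix, $\vec x$ is to be interpreted as a (d − 1)-dimensional column vector (so that $\vec x^{\,T}$ is a (d − 1)-dimensional
row vector). (Notice that as a linear combination of a boost and a rotation, $\vec P\cdot\vec x$ is symmetric in its upper left
corner and antisymmetric in its lower right corner, so to speak.) Exponentiating, you will find

$$e^{i\vec P\cdot\vec x} = \left(\begin{array}{c|c|c} 1+\frac{1}{2}\vec x^{\,2} & \vec x^{\,T} & \frac{1}{2}\vec x^{\,2} \\ \hline \vec x & I & \vec x \\ \hline -\frac{1}{2}\vec x^{\,2} & -\vec x^{\,T} & 1-\frac{1}{2}\vec x^{\,2} \end{array}\right) \qquad (59)$$

Notice that $(\vec P\cdot\vec x)^3 = 0$, so that the exponential series terminates. You are invited to verify that
$e^{i\vec P\cdot\vec x}\,e^{i\vec P\cdot\vec y} = e^{i\vec P\cdot(\vec x+\vec y)}$.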
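The two invitations above can be checked numerically. The sketch below builds the real matrix $N = i\vec P\cdot\vec x$ whose exponential is displayed in (59), confirms that its cube vanishes, and confirms that two such exponentials compose by adding their vectors. It assumes d = 4 and the index order (0, 1, 2, 3, d); the helper names are illustrative and not from the text.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b, scale=1.0):
    """Return a + scale * b for square matrices."""
    n = len(a)
    return [[a[i][j] + scale * b[i][j] for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def translation_generator(x):
    """N = i P.x: blocks (0 | x^T | 0), (x | 0 | x), (0 | -x^T | 0)."""
    n = len(x) + 2
    N = [[0.0] * n for _ in range(n)]
    for k, xk in enumerate(x, start=1):
        N[0][k] = xk        # top row:      x^T
        N[k][0] = xk        # left column:  x
        N[k][n - 1] = xk    # right column: x
        N[n - 1][k] = -xk   # bottom row:  -x^T
    return N

def exp_translation(x):
    """exp(N) = 1 + N + N^2/2, because the series terminates."""
    N = translation_generator(x)
    return add(add(identity(len(N)), N), matmul(N, N), 0.5)

def max_abs(a):
    return max(abs(v) for row in a for v in row)

x = [0.3, -1.2, 0.7]
y = [1.1, 0.4, -0.5]
N = translation_generator(x)
cube = max_abs(matmul(N, matmul(N, N)))
lhs = matmul(exp_translation(x), exp_translation(y))
rhs = exp_translation([p + q for p, q in zip(x, y)])
composition_gap = max_abs(add(lhs, rhs, -1.0))
print(cube, composition_gap)
```

Both printed numbers should vanish up to rounding, reflecting $[P_i, P_j] = 0$.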

Just as in chapter IX.6, we map the coset manifold $dS^d = SO(d,1)/SO(d-1,1)$ by acting with $g(t,\vec x) = \exp(i\vec P\cdot\vec x)\exp(iDt)$
on a reference point, which we choose to be $X_* = (0, \vec 0, 1)$ (in analogy to the south pole
for the familiar case of the sphere). (The logic here is that $X_* = (0, \vec 0, 1)$ is left invariant by the $SO(d-1,1)$
generated by $J_{ij}$ and $J_{i0} = \frac{1}{2}(P_i + K_i)$.) We obtain

$$X = gX_* = e^{i\vec P\cdot\vec x}\,e^{iDt}\,(0, \vec 0, 1) = \left(\sinh t + \tfrac{1}{2}e^{t}\vec x^{\,2},\; e^{t}\vec x,\; \cosh t - \tfrac{1}{2}e^{t}\vec x^{\,2}\right) \qquad (60)$$

We recognize that this is precisely what appears as the rather peculiar coordinatization (22) we encountered
before, but now derived group theoretically. From (60), we obtain

$$ds^2 = \eta_{MN}\,dX^M dX^N = -dt^2 + e^{2t}d\vec x^{\,2} \qquad (61)$$

namely the metric for the expanding universe (23). In other words, to describe the expanding universe in the
form (61), we coordinatize an event at $(t, \vec x)$ by the group element $g(t,\vec x) = \exp(i\vec P\cdot\vec x)\exp(iDt)$ needed to bring
the reference point $X_*$ to our event.
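A minimal numerical sketch of the step from (60) to (61), assuming L = 1 and d = 4: it checks that the point given by (60) lies on the hyperboloid and that a small displacement in $(t, \vec x)$ produces the interval $-dt^2 + e^{2t}d\vec x^{\,2}$. The function names and the sample values are illustrative.

```python
import math

def embed(t, x):
    """Embedding (60) with L = 1: returns (T, X^1, ..., X^{d-1}, W)."""
    x2 = sum(c * c for c in x)
    T = math.sinh(t) + 0.5 * math.exp(t) * x2
    W = math.cosh(t) - 0.5 * math.exp(t) * x2
    return [T] + [math.exp(t) * c for c in x] + [W]

def eta(u, v):
    """Minkowski product with signature (-, +, ..., +)."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def induced_interval(t, x, dt, dx, h=1e-5):
    """ds^2 for the displacement (dt, dx), from a central difference of the embedding."""
    p = embed(t + h * dt, [c + h * d for c, d in zip(x, dx)])
    m = embed(t - h * dt, [c - h * d for c, d in zip(x, dx)])
    dX = [(a - b) / (2 * h) for a, b in zip(p, m)]
    return eta(dX, dX)

t, x = 0.4, [0.2, -0.5, 0.3]
dt, dx = 0.7, [0.1, 0.6, -0.2]
on_shell = eta(embed(t, x), embed(t, x))                      # should equal 1
numeric = induced_interval(t, x, dt, dx)
expected = -dt ** 2 + math.exp(2 * t) * sum(d * d for d in dx)  # right side of (61)
print(on_shell, numeric, expected)
```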


Appendix 2: Discovering the expanding universe without knowing 
about Einstein’s field equation 


While we are all enamored of the beauty of group theory, the truly impoverished man ignorant of this wonderful
subject can still obtain the coordinate transformation in (22) by brute force (of course). Starting with $ds^2 = -dT^2 + dX^2 + dY^2 + dZ^2 + dW^2$,
we transform $T = f(t,r)$, $X = x\,h(t,r)$, $Y = y\,h(t,r)$, $Z = z\,h(t,r)$, $W = g(t,r)$
with $r^2 = x^2 + y^2 + z^2$ and try to get to $ds^2 = -dt^2 + a^2(t)\,d\vec x^{\,2}$, with a(t) some unknown function.
The embedding equation $-T^2 + X^2 + Y^2 + Z^2 + W^2 = 1$ (set L = 1 for convenience) gives $-f^2 + g^2 + r^2h^2 = 1$.
Plugging (as usual, $\dot f = \partial f/\partial t$, $f' = \partial f/\partial r$, and so on) $dT = \dot f\,dt + f'dr$, $dX = h\,dx + x\dot h\,dt + xh'dr$, and so forth,
into $ds^2$ and matching to the desired form gives us (a) $-\dot f^2 + \dot g^2 + r^2\dot h^2 = -1$, (b) $-\dot f f' + \dot g g' + r^2\dot h h' + rh\dot h = 0$,
(c) $h^2 = a^2$, and (d) $f'^2 = g'^2$. This is straightforward to solve. For instance, (d) gives (with no loss of generality
by flipping either T or W) $f(t,r) = g(t,r) + k(t)$, with k(t) some unknown function. Eventually, we arrive at,
as a bonus, $a(t) = e^t$ after absorbing an integration constant.
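As a hedged check of where the brute force calculation ends up, the sketch below tests one candidate solution against the embedding equation and conditions (a), (b), and (d). The candidate, $f = \sinh t + \frac{1}{2}e^t r^2$, $g = \cosh t - \frac{1}{2}e^t r^2$, $h = e^t$, is taken from the form of (60); it has $f' = -g'$, which differs from the choice $f' = g'$ in the text only by the flip $W \to -W$. The derivatives are written out by hand, and all names are illustrative.

```python
import math

# Candidate solution (L = 1), taken from (60):
#   f = sinh t + e^t r^2 / 2,   g = cosh t - e^t r^2 / 2,   h = e^t  (so h' = 0)
def f(t, r):        return math.sinh(t) + 0.5 * math.exp(t) * r * r
def g(t, r):        return math.cosh(t) - 0.5 * math.exp(t) * r * r
def h(t):           return math.exp(t)
def f_dot(t, r):    return math.cosh(t) + 0.5 * math.exp(t) * r * r
def g_dot(t, r):    return math.sinh(t) - 0.5 * math.exp(t) * r * r
def h_dot(t):       return math.exp(t)
def f_prime(t, r):  return math.exp(t) * r
def g_prime(t, r):  return -math.exp(t) * r

def conditions(t, r):
    """Return the embedding constraint and the left sides of (a), (b), (d)."""
    constraint = -f(t, r) ** 2 + g(t, r) ** 2 + (r * h(t)) ** 2              # expect +1
    a = -f_dot(t, r) ** 2 + g_dot(t, r) ** 2 + (r * h_dot(t)) ** 2           # expect -1
    b = (-f_dot(t, r) * f_prime(t, r) + g_dot(t, r) * g_prime(t, r)
         + r * h(t) * h_dot(t))                                               # expect 0 (h' = 0 term dropped)
    d = f_prime(t, r) ** 2 - g_prime(t, r) ** 2                               # expect 0
    return constraint, a, b, d

print(conditions(0.3, 1.5))
```

Condition (c) then identifies $a(t) = h = e^t$.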
The point of this little exercise is to show that mathematical types thinking about the analogs of spheres for
Minkowski spacetime could have, in principle, discovered the exponentially expanding universe long ago without
knowing about Einstein’s field equation. This suggests another extragalactic fable. We can imagine a smart guy 
in a civilization infatuated with the sphere arriving at the de Sitter universe, and then, by calculating the Ricci 
tensor (or knowing that the analog of the sphere is maximally symmetric), uncovering the cosmological constant 
and dark energy. 


Appendix 3: Geodesics in the embedding space 


Let us determine the geodesics in de Sitter spacetime using the embedding coordinates $X^M$ satisfying $X^2 = \eta_{MN}X^M X^N = 1$.
Instead of extremizing the integral $\int\sqrt{-\eta_{MN}\,dX^M dX^N}$, we extremize $\int d\zeta\left(\frac{1}{2}\dot X^2 + \frac{1}{2}\lambda(X^2 - 1)\right)$,
as explained in exercise II.2.6, imposing the constraint with the Lagrange multiplier $\lambda$. Here $\zeta$ is an
appropriate parameter and $\dot X^M = dX^M/d\zeta$.

The reader may or may not recognize this as essentially the same problem of a particle on a sphere that we did
back in appendix 4 to chapter II.3. We can lift many of the equations, suitably reinterpreted with a Minkowski
metric, from that simple problem! The equation of motion reads $\ddot X^M = \lambda X^M$, to be solved with the constraint
$X^2 = 1$, which, when differentiated, gives $X\cdot\dot X = 0$ (where we indicate the dot in the dot product to remind us
that we are dealing with vectors). Using the equation of motion, we also have $\ddot X\cdot\dot X = 0$, thus concluding that $\dot X^2$
is a constant. As in chapter II.3, we verify by direct differentiation that $J^{MN} = X^M\dot X^N - X^N\dot X^M$ is conserved.
Define $2J^2 \equiv J^{MN}J_{MN} = 2(X^M\dot X^N - X^N\dot X^M)X_M\dot X_N = 2\dot X^2$. (Note that both signs are possible for $J^2$.) Hence
we have $\dot X^2 = J^2$. Differentiating $X\cdot\dot X = 0$ once more gives $\dot X^2 + X\cdot\ddot X = 0$, so that $\lambda = -J^2$ and the equation of
motion becomes $\ddot X^M = -J^2X^M$.

For definiteness, let us consider a timelike geodesic. Since $\dot X^2 = J^2$ is negative, write $K^2 \equiv -J^2$ with K real.
Then this last equation has the obvious solution $X^M = a^Me^{K\zeta} + b^Me^{-K\zeta}$ with a and b two constant real vectors.
The constraint $1 = X^2 = a^2e^{2K\zeta} + b^2e^{-2K\zeta} + 2a\cdot b$ implies that $a^2 = b^2 = 0$, $2a\cdot b = 1$. The motion of the particle
is completely solved:

$$X = ae^{K\zeta} + be^{-K\zeta}, \quad \text{with } a^2 = 0 = b^2 \text{ and } 2a\cdot b = 1 \qquad (62)$$


Geometrically, the particle travels along the hyperbolic version of great circles on the surface defined by (2). You
can verify this by following steps analogous to those given in chapter II.3. For a spacelike geodesic, $J^2$ is positive.
Evidently, we replace $e^{\pm K\zeta}$ by $\cos J\zeta$ and $\sin J\zeta$.




For a lightlike geodesic, J = 0, and the equation of motion collapses to $\ddot X = 0$ with the solution $X = a + b\zeta$,
with a and b two Minkowski vectors. In the embedding space, light travels along a straight line. The lightlike
condition $(dX)^2 = 0$ implies that $b^2 = 0$. The condition that the photon stays on the de Sitter hyperboloid (2)
implies $X^2 = a^2 + 2a\cdot b\,\zeta = 1$. Thus, lightlike geodesics are determined by

$$X = a + b\zeta, \quad \text{with } a^2 = 1,\ b^2 = 0, \text{ and } a\cdot b = 0 \qquad (63)$$

Confusio looks puzzled for a moment, muttering “How can $(dX)^2 = 0$ and $X^2 = 1$ both be satisfied?” Yes,
they can.

It is fun to verify that these geodesics are indeed followed in any specific coordinate system we use to map out 
de Sitter spacetime. Let us pick, for example, the expanding universe coordinates in (23), since the cosmologists 
like them. 

Consider a cosmologist at rest at r = 0 (which is in fact anywhere in the universe) tracing a perfectly respectable
timelike geodesic. Inspecting (22), we have (setting L = 1) $X = (\sinh t, 0, 0, 0, -\cosh t) = \frac{1}{2}e^{t}(1, 0, 0, 0, -1) - \frac{1}{2}e^{-t}(1, 0, 0, 0, 1)$,
in agreement with (62) with $\zeta = t$, $K = 1$. Thus, $a = \frac{1}{2}(1, 0, 0, 0, -1)$, $b = -\frac{1}{2}(1, 0, 0, 0, 1)$.
Indeed, $a^2 = 0 = b^2$ and $2a\cdot b = 1$.

Next, consider a photon moving along the x-axis, starting at $x = 0$, $t = t_S$. (Note that the expanding universe is
not invariant in t, and recall that we studied this problem in chapter V.3.) Then $dx = e^{-t}dt$, so that $x = e^{-t_S} - e^{-t}$,
with the photon reaching $x(t = \infty) = e^{-t_S}$ at $t = \infty$. We recover the de Sitter horizon. Now (22) gives
$T = X^0 = \sinh t + \frac{1}{2}(e^{-t_S} - e^{-t})^2e^{t} = \frac{1}{2}(1 + e^{-2t_S})e^{t} - e^{-t_S}$, $W = X^4 = \frac{1}{2}(1 - e^{-2t_S})e^{t} + e^{-t_S}$, and
$X = X^1 = (e^{-t_S} - e^{-t})e^{t} = e^{t - t_S} - 1$. Indeed, (63) is satisfied with $a = (-e^{-t_S}, -1, 0, 0, e^{-t_S})$,
$b = \left(\frac{1}{2}(1 + e^{-2t_S}),\ e^{-t_S},\ 0,\ 0,\ \frac{1}{2}(1 - e^{-2t_S})\right)$, and $\zeta = e^{t}$.

Confusio is amazed, but you know that we are merely checking that the defining equation (2) is satisfied and
that, if $ds^2 = 0$ in one set of coordinates, $ds^2 = 0$ in any other set of coordinates.
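The photon check can be repeated numerically. The sketch below assumes L = 1 and takes the embedding in the form $T = \sinh t + \frac{1}{2}e^t x^2$, $X^1 = e^t x$, $W = \cosh t - \frac{1}{2}e^t x^2$, which is an assumption about the sign convention of (22); the vectors a and b are the ones quoted above, and the function names are illustrative.

```python
import math

def eta(u, v):
    """Minkowski product with signature (-, +, +, +, +)."""
    return -u[0] * v[0] + sum(p * q for p, q in zip(u[1:], v[1:]))

def photon_vectors(t_s):
    """Constant vectors a, b for a photon emitted at x = 0, t = t_s."""
    e1, e2 = math.exp(-t_s), math.exp(-2 * t_s)
    a = [-e1, -1.0, 0.0, 0.0, e1]
    b = [0.5 * (1 + e2), e1, 0.0, 0.0, 0.5 * (1 - e2)]
    return a, b

def photon_from_coordinates(t, t_s):
    """Embedding of the photon's event, using x = e^{-t_s} - e^{-t}."""
    x = math.exp(-t_s) - math.exp(-t)
    T = math.sinh(t) + 0.5 * math.exp(t) * x * x
    W = math.cosh(t) - 0.5 * math.exp(t) * x * x
    return [T, math.exp(t) * x, 0.0, 0.0, W]

t_s, t = 0.3, 1.7
a, b = photon_vectors(t_s)
zeta = math.exp(t)                                   # affine parameter, zeta = e^t
line = [ai + bi * zeta for ai, bi in zip(a, b)]      # straight line a + b*zeta
point = photon_from_coordinates(t, t_s)
gap = max(abs(p - q) for p, q in zip(line, point))   # should vanish
print(eta(a, a), eta(b, b), eta(a, b), gap)
```

The first three printed numbers test $a^2 = 1$, $b^2 = 0$, and $a\cdot b = 0$; the last tests that the straight line in the embedding space passes through the photon's event.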

As yet another example, follow a photon starting at r = 0 and t = 0 in static coordinates (35). Integrating
$dt = dr/(1 - r^2)$ with these initial conditions, we obtain $r = \tanh t$. As expected, it reaches the horizon r = 1 at
$t = \infty$. We check readily that (63) holds with a = (0, 0, 1) and b = (1, 1, 0) (with a minor abuse of notation).


Appendix 4: Space of spheres and de Sitter spacetime 


In this appendix, we present an amusing tidbit regarding the space of spheres and de Sitter spacetime.¹⁰

Consider the space of spheres living in ordinary 3-dimensional Euclidean space. We need 3 numbers $\vec x = (x, y, z)$
to specify the location of the sphere and a number R to specify its radius. Thus, the space of spheres is
4-dimensional. Any guesses on what this 4-dimensional space is? Of course, given the way I have set you up, and the
mere fact that this appendix is in a chapter on de Sitter spacetime, you might suspect that it is $dS^4$. But as you
will see, it is quite remarkable how the connection works.

Picture two spheres, one with radius R located at $\vec x$, the other with radius R′ located at $\vec x'$. See figure 4. Suppose
the two spheres intersect. The intersection is a circle perpendicular to $\vec x - \vec x'$. Pick any point V on this circle. (By
rotational invariance, it will be clear that for our purposes, it does not matter which point we pick.) Consider the
triangle formed by the centers C and C′ of the two spheres and V. Denote the angle CVC′ by $\Theta$. (If you prefer,
you can talk about the 3-dimensional space of circles, which is slightly easier to visualize. The intersection of two
circles consists of two points, and we could pick either as V.) Then $(\vec x - \vec x')^2 = R^2 + R'^2 - 2RR'\cos\Theta$.

Let us associate the two spheres with two points X and X′ on the hyperboloid that defines $dS^4$. We will work
out the association as we go along. In the embedding 5-dimensional Minkowski spacetime, the distance between
the two points is given by $(X - X')^2 = 2(1 - X\cdot X')$, where $X\cdot X' = -\frac{1}{2}(X^+X'^- + X^-X'^+) + \vec X\cdot\vec X'$ in light cone
coordinates, with the dot product defined by the 5-dimensional Minkowski metric, of course. We can evaluate
$X\cdot X'$ in any of the coordinate systems listed in this chapter, but with malice of hindsight, let us use the flat
expanding universe coordinates given in (25):

$$2X\cdot X' = -e^{t}\left(e^{t'}\vec x'^{\,2} - e^{-t'}\right) - \left(e^{t}\vec x^{\,2} - e^{-t}\right)e^{t'} + 2e^{t}e^{t'}\,\vec x\cdot\vec x' = e^{t-t'} + e^{t'-t} - e^{t+t'}(\vec x - \vec x')^2 \qquad (64)$$

But in the space of spheres, we have $2\cos\Theta = \frac{R}{R'} + \frac{R'}{R} - \frac{(\vec x - \vec x')^2}{RR'}$, as mentioned above. Thus, if we make the
association $R = e^{-t}$ and $R' = e^{-t'}$, we see that $\cos\Theta = X\cdot X'$.
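The association can be tested numerically. The sketch below picks two intersecting spheres, computes the angle from the triangle relation, and compares it with the Minkowski product of the two associated points, using $t = -\ln R$. The embedding is taken in the form of (60) with L = 1, which is an assumption about the conventions of (25); the names and sample values are illustrative.

```python
import math

def embed(t, x):
    """Flat-slicing embedding with L = 1: (T, X, Y, Z, W)."""
    x2 = sum(c * c for c in x)
    T = math.sinh(t) + 0.5 * math.exp(t) * x2
    W = math.cosh(t) - 0.5 * math.exp(t) * x2
    return [T] + [math.exp(t) * c for c in x] + [W]

def eta(u, v):
    """5-dimensional Minkowski product with signature (-, +, +, +, +)."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def cos_angle(R, x, Rp, xp):
    """cos of the angle at V in the triangle C V C', from the law of cosines."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, xp))
    return (R * R + Rp * Rp - dist2) / (2 * R * Rp)

R, x = 1.3, [0.0, 0.2, -0.4]
Rp, xp = 0.8, [1.1, -0.3, 0.5]
from_spheres = cos_angle(R, x, Rp, xp)
from_de_sitter = eta(embed(-math.log(R), x), embed(-math.log(Rp), xp))
print(from_spheres, from_de_sitter)
```

The two printed values should coincide up to rounding.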
For example, the condition that two spheres barely touch, namely $\cos\Theta = -1$, translates into the antipodal
condition $X\cdot X' = -1$. We can now translate between the two mathematical constructs. As another example,
the invariant volume of spacetime $d^4x\sqrt{-g} = d^3x\,dt\,e^{3t} = d^3x\,dR/R^4$ gets translated into a measure that severely
suppresses large spheres.


Figure 4 Two spheres intersecting, with the
relation between various lengths and the angle
$\Theta$ fixed by elementary trigonometry.


I find it quite surprising that the 19th century space of spheres somehow “knows” about the 20th century
flat expanding universe, not to mention the Minkowski metric. Sitting in a flat expanding universe, we are each
associated with a sphere whose radius shrinks with the inexorable passage of time.


Exercises


1. Using the “slicing” in (22), derive the standard form (23) of the exponentially expanding universe.


2. Verify for the various metrics of de Sitter spacetime the maximal symmetry relations (for d = 4)
$R_{\mu\nu\lambda\sigma} = +\frac{1}{L^2}(g_{\mu\lambda}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\lambda})$, $R_{\mu\nu} = +\frac{3}{L^2}g_{\mu\nu}$, and $R = +\frac{12}{L^2}$.


3. Starting from the static coordinates (35), you can obtain the de Sitter analog of the coordinates we used for
spherical black holes in chapter VII.2 by defining $dp = dt + \frac{dr}{1 - r^2/L^2}$ and $dq = -dt + \frac{dr}{1 - r^2/L^2}$. Show that

$$ds^2 = \left(1 - \frac{r^2}{L^2}\right)dp\,dq + r^2d\Omega_2^2 \qquad (65)$$


4. Show that the metric (36), with aM replaced by aM, also solves the vacuum Einstein equation in d
dimensions.


5. Show that in conformal coordinates, the event horizons of an observer at the south pole and of an observer
at the north pole are given in terms of the embedding coordinates by $T \pm W = 0$, respectively, and r = 1.


6. Complete the brute force calculation in appendix 2.


7. Show that lines of constant $\chi$, $\theta$, and $\varphi$ in de Sitter’s original metric (51) are not geodesics unless $\chi = 0$.


Notes


1. A useful reference on de Sitter and anti de Sitter spacetimes is M. Spradlin, A. Strominger, and A. Volovich,
Proceedings of the LXXVI Les Houches School, 2001.

2. Our treatment is resolutely from the physicist’s point of view. For an entry to the mathematical literature, a
starting point might be “A Geometrical Background for De Sitter’s World” by H. S. M. Coxeter, Am. Math.
Monthly 50 (1943), pp. 217-228.

3. The isometry groups should be, strictly speaking, O(d + 1) and O(d, 1), respectively, but we won’t be talking
much about reflections.

4. I like this partly because I learned it from A. Einstein’s The Meaning of Relativity, directly from the master,
so to speak.

5. I was motivated to display this many forms of the de Sitter metric when I participated in various workshops
and schools and realized that many in the audience were unaware of the variety of ways in which de Sitter
and anti de Sitter spacetimes could be written.

6. Actually, it requires only a minimal amount of quantum field theory; I am almost tempted to devote an
appendix to it. The clearest derivation I know of is in section III.2 of the article by M. Spradlin et al. Let
me mention one key feature: for the observer sitting at r = 0, the passage of proper time between two
events is measured by $X\cdot X' = -\sinh t\sinh t' + \cosh t\cosh t' = \cosh(t - t')$, but the function $\cosh(\Delta t) = \cosh(\Delta t + 2\pi i)$
is periodic in imaginary time. Thus, we can again invoke the “mystical argument” of time
as an angle mentioned in appendix 1 of chapter VII.3.

7. People have argued that our lack of knowledge of what happens beyond the horizon amounts to a kind of
entropy.

8. W. de Sitter, Proc. Royal Acad. Amsterdam, XIX (1917), p. 1217. De Sitter’s motivation seems somewhat
muddled to modern eyes. He began by saying that Einstein had proposed the boundary condition ($g_{00} = -1$, $g_{ij} = 0$)
at infinity (which is clearly not invariant under coordinate transformation) and proposed instead
that $g_{\mu\nu} = 0$ at infinity. Interestingly, he stated in a footnote, “The idea to make the 4-dimensional world
spherical in order to avoid the necessity of assigning boundary-conditions, was suggested several months
ago by Prof. Ehrenfest, in a conversation with the writer. It was, however, at that time not further developed.”
Also, in a postscript, de Sitter said that he communicated his result to Einstein, who wrote back objecting to
a universe without matter. I am grateful to Gary Gibbons for showing me this paper.

9. I learned this from a paper by S. Deser and A. Waldron.

10. I am grateful to Gary Gibbons for telling me about this interesting connection between the space of spheres,
which was developed in 19th century mathematics, and de Sitter spacetime.


A container for gravity


Now that you have mastered de Sitter spacetime, you are ready to tackle anti de Sitter 
spacetime. Incidentally, this antiterminology appears to be modern, as all these spacetimes 
were referred to as de Sitter in the older literature. 

As you well know, in theoretical physics, it is often useful to enclose in a box the sys- 
tem we want to study, be it the electromagnetic field or a quantum particle. Unfortunately, 
there is no known material out of which we can construct a box to contain the gravitational 
field. However, as you will learn in this chapter, anti de Sitter spacetime possesses a spatial 
boundary consisting of a Minkowskian spacetime of one lower dimension. For example, 
the 5-dimensional anti de Sitter spacetime has as boundary the 4-dimensional $M^{3,1}$ spacetime.”
This striking feature prompts us to use anti de Sitter spacetime as a container? (the
term tin can is sometimes used) for quantum gravity, the only way we know to confine the 
gravitational field and study its properties. 

Interest in anti de Sitter spacetime has exploded* in recent years due to the amazing 
discovery by string theorists, notably Maldacena and others, that the physics of various 
theories of gravity in $AdS^5$ can be mapped onto the physics of certain gauge theories on
the $M^{3,1}$ spacetime that forms the boundary of the $AdS^5$ spacetime. That this is even conceivable
is intimately connected to the holographic principle mentioned in chapter VII.3.
This surprising correspondence, known as* AdS/CFT correspondence, or more accurately, 
as the gauge/gravity duality, promises to shed light® on both quantum gravity and strongly 
coupled gauge theories. More recently, a great deal of excitement has also been generated 
by the possible relevance of this correspondence to condensed matter physics. 


* AdS stands for anti de Sitter and CFT for conformal field theories: some special gauge theories are conformally
invariant, hence the term.

Figure 1 The d-dimensional anti de Sitter spacetime $AdS^d$ embedded in (d + 1)-dimensional
Minkowski-type spacetime $M^{d-1,2}$.


Anti de Sitter spacetime 


The d-dimensional anti de Sitter spacetime $AdS^d$ with length scale L is defined as the set of
all points $(X^0, X^1, X^2, \cdots, X^d)$ in (d + 1)-dimensional Minkowski-type spacetime $M^{d-1,2}$
(that is, a spacetime with $ds^2 = -(dX^0)^2 + (dX^1)^2 + (dX^2)^2 + \cdots + (dX^{d-1})^2 - (dX^d)^2$)
satisfying $-(X^0)^2 + (X^1)^2 + (X^2)^2 + \cdots + (X^{d-1})^2 - (X^d)^2 = -L^2$, which we write as

$$(X^0)^2 - \sum_{i=1}^{d-1}(X^i)^2 + (X^d)^2 = L^2 \quad (\text{anti de Sitter spacetime}) \qquad (1)$$

as shown in figure 1.
Compare and contrast this embedding equation with the one for de Sitter spacetime
given in (IX.10.2), which I display here again for convenience:

$$-(X^0)^2 + \sum_{i=1}^{d-1}(X^i)^2 + (X^d)^2 = L^2 \quad (\text{de Sitter spacetime } dS^d) \qquad (2)$$


In parallel with the discussion for de Sitter spacetime, we know that anti de Sitter
spacetime is also maximally symmetric.

Note that the isometry group for anti de Sitter spacetime is $SO(d-1, 2)$ rather than
$SO(d, 1)$. Aside from this, much of the discussion for de Sitter spacetime could now be
repeated. A point on anti de Sitter spacetime, $(0, \cdots, 0, L)$ for example, is left invariant
by the $SO(d-1, 1)$ that acts on the remaining d coordinates $(X^0, X^1, \cdots, X^{d-1})$. In other
words, $AdS^d$ is the coset manifold $SO(d-1, 2)/SO(d-1, 1)$.
For example, $AdS^5 = SO(4, 2)/SO(4, 1)$.
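A quick dimension count supports the coset description. The sketch below uses only the fact that $SO(p, q)$ has $n(n-1)/2$ generators with $n = p + q$; the isometry group acts on all d + 1 embedding coordinates, while the subgroup fixing a point acts on the remaining d. The function names are illustrative.

```python
def dim_so(n):
    """Number of generators of SO(p, q) with p + q = n."""
    return n * (n - 1) // 2

def coset_dimension(d):
    """dim of the isometry group (n = d + 1) minus dim of the subgroup fixing a point (n = d)."""
    return dim_so(d + 1) - dim_so(d)

for d in (2, 3, 4, 5):
    print(d, dim_so(d + 1), dim_so(d), coset_dimension(d))
```

For d = 5, the count is 15 − 10 = 5, as it should be for $AdS^5$. The count fixes only the dimension of the coset, not the signature of the subgroup.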

The isometry groups of de Sitter and anti de Sitter spacetimes are contrasted in this
table:


    Spacetime    Isometry group
    $AdS^d$      $SO(d-1, 2)$
    $dS^d$       $SO(d, 1)$


The signs in (1), in contrast to those in (2), make all the difference in the world! Some
authors treat de Sitter and anti de Sitter together by introducing the sign $\sigma = +1$ for dS
and $\sigma = -1$ for AdS, so that the embedding equations (1) and (IX.10.2) are unified into
$\eta_{\mu\nu}X^\mu X^\nu + \sigma(X^d)^2 = \sigma L^2$, with $\mu, \nu = 0, 1, \cdots, d-1$. In general, I won’t. I find the
practice confusing, not worth saving some space in the exposition, but occasionally, it
is instructive to compare the two spacetimes side by side, as I will do presently.

In the preceding chapter, we coordinatized de Sitter spacetime by eliminating $W = X^d$.
We can do the same for anti de Sitter spacetime; indeed, it is illuminating to do them
side by side, as I just outlined. With $W^2 = L^2 - \sigma X\cdot X$ (the notation is self-evident:
$X\cdot X = \eta_{\mu\nu}X^\mu X^\nu$), we have $W\,dW = -\sigma X\cdot dX$ and $dW^2 = (X\cdot dX)^2/(L^2 - \sigma X\cdot X)$.
Hence, we obtain

$$ds^2 = \eta_{\mu\nu}dX^\mu dX^\nu + \sigma\,dW^2 = \left(\eta_{\mu\nu} + \sigma\frac{\eta_{\mu\lambda}\eta_{\nu\rho}X^\lambda X^\rho}{L^2 - \sigma X\cdot X}\right)dX^\mu dX^\nu \qquad (3)$$

as in the preceding chapter, but now with $\sigma = \pm 1$. Thus, with the metric of $AdS^d$ written in
terms of d-dimensional coordinates, we can go back and forth between de Sitter and anti
de Sitter spacetimes by formally letting $L^2 \to -L^2$.
In coordinates for which $g_{\mu\nu}$ is dimensionless, we thus have immediately

$$R_{\mu\nu\lambda\sigma} = -\frac{1}{L^2}\left(g_{\mu\lambda}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\lambda}\right) \qquad (4)$$

and

$$R_{\mu\nu} = -\frac{(d-1)}{L^2}g_{\mu\nu}, \qquad R = -\frac{d(d-1)}{L^2}, \qquad E_{\mu\nu} = \frac{(d-1)(d-2)}{2L^2}g_{\mu\nu} \qquad (5)$$

These expressions differ from the corresponding expressions (IX.10.4 and IX.10.5) for
de Sitter spacetime by an overall sign, and so anti de Sitter spacetime solves Einstein’s field
equation $R_{\mu\nu} = 8\pi G\Lambda g_{\mu\nu}$ with a negative cosmological constant given by $8\pi G\Lambda = -\frac{3}{L^2}$ for d = 4. I
remind you that, as explained in the preceding chapter, the observed cosmological constant
is positive, leading to a de Sitter spacetime.

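Again, we can calculate the Riemann curvature tensor, as in the preceding chapter, by
letting $X \to 0$ in (3), so that $g_{\mu\nu} \simeq \eta_{\mu\nu} + \frac{\sigma}{L^2}\eta_{\mu\lambda}\eta_{\nu\rho}X^\lambda X^\rho$. The metric is locally flat at
$X^\mu = 0$. Referring to (VI.1.14), we obtain $R_{\tau\rho\mu\nu} = \frac{\sigma}{L^2}(\eta_{\tau\mu}\eta_{\rho\nu} - \eta_{\tau\nu}\eta_{\rho\mu})$. Promoting $\eta_{\mu\nu}$ to
$g_{\mu\nu}$, we obtain $R_{\tau\rho\mu\nu}$ at an arbitrary point. Setting $\sigma = +1$ for dS and $\sigma = -1$ for AdS, we
obtain (IX.10.4) and (4), respectively.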

For pedagogical clarity, instead of writing everything in terms of an arbitrary d, I will
often specialize to whatever value of d suits my purpose best. I often call $X^0 = T$ and
$X^d = W$. Here is a table comparing and contrasting $AdS^4$ and $dS^4$:


    Spacetime   Embedding equation                          Metric of the embedding space
    $AdS^4$     $-T^2 + X^2 + Y^2 + Z^2 - W^2 = -L^2$       $ds^2 = -dT^2 + dX^2 + dY^2 + dZ^2 - dW^2$
    $dS^4$      $-T^2 + X^2 + Y^2 + Z^2 + W^2 = L^2$        $ds^2 = -dT^2 + dX^2 + dY^2 + dZ^2 + dW^2$


That the isometry group is $SO(3, 2)$ for $AdS^4$ but is $SO(4, 1)$ for $dS^4$ should now be as
clear as day.


Two “times”?


Note that in (1), the embedding is not into the familiar Minkowski spacetime $M^{d,1}$, but
into $M^{d-1,2}$ with two time coordinates. What do we do with two “times”? Very strange®
indeed!

Physics as we know it does not admit two times.’ Set d = 3 for definiteness. Let us
understand how $AdS^3$ ends up having only one time coordinate, even though it started
with two. According to (1), $AdS^3$ is defined by

$$(T^2 + W^2) - (X^2 + Y^2) = L^2 \qquad (6)$$

with a metric induced from the two-time Minkowski metric $\eta = \mathrm{diag}(-1, +1, +1, -1)$ of
the embedding space $M^{2,2}$, namely

$$ds^2 = -(dT^2 + dW^2) + (dX^2 + dY^2) \qquad (7)$$


As shown in figure 1, we may picture $AdS^3$ as embedded in $M^{2,2}$, somewhat crudely,
as a cylindrical tube with a radius that increases with increasing X and Y. In the (T, X)
plane with W = 0 = Y, the defining equation traces out two hyperbolas $T = \pm\sqrt{L^2 + X^2}$.
Compared to the tube shown in figure IX.10.1, the tube here is lying on its side, so to
speak.

I have grouped the coordinates in (6) and (7) to render the isometry group $SO(2, 2)$
manifest. Replace the two time coordinates (T, W) by polar coordinates (R, t) by setting
$T = R\cos t$ and $W = R\sin t$. Similarly, replace the two space coordinates (X, Y) by polar
coordinates $(r, \theta)$ by setting $X = r\cos\theta$ and $Y = r\sin\theta$. Then $ds^2 = -(dR^2 + R^2dt^2) + (dr^2 + r^2d\theta^2)$.
The apparent difficulty is that we have two time coordinates (R, t).

I have dragged L around long enough. Just as in the preceding chapter, for ease of
writing, I am now going to unceremoniously set L = 1. The resolution of the puzzle of
two times is that the apparent temporal coordinate R is not independent of the spatial
coordinate r, since the defining equation $R^2 - r^2 = 1$ constrains them. Differentiating, we
obtain $R\,dR = r\,dr$ and hence $dR^2 - dr^2 = \left(\frac{r^2}{R^2} - 1\right)dr^2 = -\frac{dr^2}{1 + r^2}$. Thus, we obtain

$$ds^2 = -(1 + r^2)\,dt^2 + \frac{dr^2}{1 + r^2} + r^2d\theta^2 \qquad (8)$$

We end up with only one time coordinate!

The metric (8) describes a manifestly respectable spacetime. Note also that the metric
does not depend on time and hence these coordinates are known as static. They are
defined by

$$T = \sqrt{1 + r^2}\cos t, \quad W = \sqrt{1 + r^2}\sin t, \quad X = r\cos\theta, \quad Y = r\sin\theta \qquad (9)$$
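The disappearance of the second time can be watched numerically. The sketch below, with L = 1, evaluates the embedding (9), checks the defining equation (6), and compares the interval induced from the two-time metric (7) with the static form (8) for a small displacement. The function names and sample values are illustrative.

```python
import math

def embed_ads3(t, r, theta):
    """Static coordinates (9) with L = 1: returns (T, W, X, Y)."""
    R = math.sqrt(1 + r * r)
    return (R * math.cos(t), R * math.sin(t), r * math.cos(theta), r * math.sin(theta))

def two_time_product(u, v):
    """The two-time metric (7): -(T T' + W W') + (X X' + Y Y')."""
    return -(u[0] * v[0] + u[1] * v[1]) + (u[2] * v[2] + u[3] * v[3])

def induced_interval(t, r, theta, dt, dr, dtheta, h=1e-5):
    """ds^2 for the displacement (dt, dr, dtheta), from a central difference."""
    p = embed_ads3(t + h * dt, r + h * dr, theta + h * dtheta)
    m = embed_ads3(t - h * dt, r - h * dr, theta - h * dtheta)
    dX = [(a - b) / (2 * h) for a, b in zip(p, m)]
    return two_time_product(dX, dX)

t, r, theta = 0.9, 1.4, 0.6
dt, dr, dtheta = 0.5, -0.3, 0.8
point = embed_ads3(t, r, theta)
constraint = -two_time_product(point, point)        # (T^2 + W^2) - (X^2 + Y^2), expect 1
numeric = induced_interval(t, r, theta, dt, dr, dtheta)
static = -(1 + r * r) * dt ** 2 + dr ** 2 / (1 + r * r) + r * r * dtheta ** 2  # right side of (8)
print(constraint, numeric, static)
```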


Did you watch the magician carefully? How did one of the two time coordinates disappear,
leaving us with only one time coordinate t? Even better, you should have done
the calculation. Okay, the secret, so to speak, behind the two-timing $M^{2,2}$ becoming a
respectable spacetime is that the spatial coordinate r is more spatial than the temporal
coordinate R is temporal, in the sense that $R^2 > r^2$, so that $dR^2 < dr^2$.

Incidentally, something similar occurred back in chapter I.6 in the construction of
hyperbolic spaces. They were embedded into what at that stage of the book we called
“pseudo-Euclidean” spaces, but in fact, the hyperbolic spaces turned out to be perfectly
Euclidean.

We can of course immediately jump to $AdS^d$ by replacing in the preceding discussion
$X^2 + Y^2$ by $\sum_{i=1}^{d-1}(X^i)^2$ and so on and so forth, to obtain $ds^2 = -(1 + r^2)dt^2 + \frac{dr^2}{1 + r^2} + r^2d\Omega_{d-2}^2$.
Just as in the de Sitter case (see (IX.10.35)), the metric has the same form
$ds^2 = -f(r)dt^2 + f(r)^{-1}dr^2 + r^2d\Omega_{d-2}^2$ as the Schwarzschild metric. But instead of $f(r) = 1 - r^2$
in the de Sitter case, we now have $f(r) = 1 + r^2 > 0$, and thus anti de Sitter spacetime
does not have a horizon.


Time to unwind time! 


I offered a word of caution about figure IX.10.1; so two words of double caution about
figure 1. It is hard enough to have spatial intuition about $M^{2,1}$, let alone $M^{2,2}$.

Indeed, figure 1 indicates that there are closed timelike curves, which would also
threaten physics as we know it. We changed variables by $T = R\cos t$ and $W = R\sin t$, and
so t started out as a periodic variable. However, in the spacetime defined by the resulting
metric (8), the time coordinate t flows majestically from $-\infty$ to $+\infty$. For us physicists,
then, we simply define anti de Sitter spacetimes by the metric in (8). Mathematicians would
say that physicists have gone to the universal cover. Colloquially, just between you and me,
we could say that we have unwrapped the circular time coordinate. Picture a roll of paper
towels being unrolled into a long rectangular strip. Since the roll is not cylindrical, as
shown in figure 1, the resulting strip of paper cannot be laid down flat, which is precisely
what those factors of $(1 + r^2)$ in (8) are telling us.

The isometry group for $AdS^d$, $SO(d-1, 2)$, contains $SO(d-1) \times SO(2)$ as its maximal
compact subgroup. Referring to (1), we see that, evidently, $SO(d-1)$ rotates
$(X^1, X^2, \cdots, X^{d-1})$, while $SO(2)$ rotates $(X^0, X^d)$. In other words, $SO(2)$ rotates
$T = R\cos t$ and $W = R\sin t$ into each other, thus translating $t \to t + \text{constant}$.


A hyperbolic radial coordinate


Those readers adept at doing integrals will recognize that the appearance of $1 + r^2$ in (8)
is begging us to change variable to $r = \sinh\rho$. Doing so gives

$$ds^2 = -\cosh^2\rho\,dt^2 + d\rho^2 + \sinh^2\rho\,d\Omega_{d-2}^2 \qquad (10)$$

Recall that $dH_{d-1}^2 = d\rho^2 + \sinh^2\rho\,d\Omega_{d-2}^2$. With these coordinates, space is hyperbolic.
The original embedding coordinates are given by

$$T = \cosh\rho\cos t, \quad W = \cosh\rho\sin t, \quad X = \sinh\rho\cos\theta, \quad Y = \sinh\rho\sin\theta \qquad (11)$$

(for d = 3). Note that since r ranges from 0 to $+\infty$, $\rho$ also ranges from 0 to $+\infty$. As usual,
the angular coordinates just go along for the ride.


Conformal coordinates for anti de Sitter spacetime 


Like de Sitter spacetime, anti de Sitter spacetime wears many disguises, as already indicated
by (8) and (10).

In parallel with the preceding chapter, I list the many faces of anti de Sitter spacetime
in a table near the end of this chapter. Let us now find the conformal coordinates for anti
de Sitter spacetime, namely the analog of (IX.10.43). Start with (8), set $r = \tan\psi$, and
behold:

$$ds^2 = \frac{1}{\cos^2\psi}\left(-dt^2 + d\psi^2 + \sin^2\psi\,d\Omega_{d-2}^2\right) = \frac{1}{\cos^2\psi}\left(-dt^2 + d\Omega_{d-1}^2\right) \qquad (12)$$

Of course, we could also have started with (10) and set $\sinh\rho = \tan\psi$ and $\cosh\rho = (\cos\psi)^{-1}$.
Compare and contrast with the de Sitter spacetime in (IX.10.43).
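For readers who want the intermediate steps, here is a short worked version of the substitution, written for the static metric in d dimensions with L = 1. It uses nothing beyond $r = \tan\psi$ and elementary trigonometry.

```latex
\begin{aligned}
r = \tan\psi \;&\Rightarrow\; 1 + r^2 = \frac{1}{\cos^2\psi}, \qquad dr = \frac{d\psi}{\cos^2\psi}, \\
\frac{dr^2}{1 + r^2} &= \cos^2\psi \cdot \frac{d\psi^2}{\cos^4\psi} = \frac{d\psi^2}{\cos^2\psi}, \qquad
r^2 = \frac{\sin^2\psi}{\cos^2\psi}, \\
ds^2 &= -(1 + r^2)\,dt^2 + \frac{dr^2}{1 + r^2} + r^2\,d\Omega_{d-2}^2
      = \frac{1}{\cos^2\psi}\left(-dt^2 + d\psi^2 + \sin^2\psi\,d\Omega_{d-2}^2\right).
\end{aligned}
```

The overall factor $1/\cos^2\psi$ is the conformal factor; it diverges as $\psi \to \pi/2$, which is where $r \to \infty$.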

The time coordinate t runs from $-\infty$ to $+\infty$ (as it did in (10)). Thus, the conformal
diagram, as shown in figure 2, is a strip extending to infinity in the time direction. As
usual, light travels at ±45° to the vertical. Note that each point in this 2-dimensional plot
describes a sphere $S^{d-2}$ of radius $\sin\psi$, which varies from 0 to 1 (also see below).

We see that $AdS^d$ is conformally equivalent to a spacetime with $ds^2 = -dt^2 + d\Omega_{d-1}^2$.
Recall from chapter VII.2 that the Minkowski spacetime $M^{d-1,1}$ is conformally equivalent
to the same spacetime, topologically a cylinder for d > 2. We will see in appendix 3 that
$AdS^2$ is special.


Figure 2 Anti de Sitter spacetime $AdS^d$ represented by a conformal diagram
consisting of a strip extending from $-\infty$ to $+\infty$ in the time coordinate t
(vertical axis), bounded by $\psi = 0$ and $\psi = \pi/2$ (horizontal axis).
Light travels at ±45° to the vertical. Each point in this 2-dimensional plot
describes a sphere $S^{d-2}$ of radius $\sin\psi$, which varies from 0 to 1.


Note that $\psi$, which plays the role of a latitude, is a spatial coordinate, in contrast to its
analog t in (IX.10.43). As r (or $\sinh\rho$) goes from 0 to $\infty$, $\psi$ goes from 0 to $\pi/2$.
Pay attention! What, $\pi/2$, not $\pi$?


Anti de Sitter spacetime has a boundary 


Perhaps the existence of a boundary is the most striking feature of anti de Sitter spacetime,
a feature that has been much exploited to contain gravity, as was mentioned at the start of
this chapter. We will first show the boundary mathematically, and then more physically.

In getting to the form $ds^2 = \frac{1}{\cos^2\psi}\left(-dt^2 + d\psi^2 + \sin^2\psi\,d\Omega_{d-2}^2\right)$ in (12), we changed
variables by setting $r = \tan\psi$. The radial coordinate r started out nice and easy, ranging
from 0 to $+\infty$. But now comes something interesting, as just noted: with a seemingly
innocuous change of variable, we have a latitude $\psi$ that starts from 0 but gets up to only
$\pi/2$, not $\pi$! Starting at the north pole, it reaches only the equator, not the south pole. What
kind of a weakling latitude is that!

Thus, while it is locally correct to write $d\Omega^2_{d-1}$ for $d\psi^2 + \sin^2\psi\, d\Omega^2_{d-2}$ in (12), it is misleading. Space covers only the northern hemisphere of $S^{d-1}$, with a boundary at the equator. In other words, we don’t have the full sphere $S^{d-1}$. Rather, we have only a hemisphere, which is topologically the same as a $(d-1)$-dimensional generalized disk or ball $B^{d-1}$. It may be helpful to think of a familiar example: the northern hemisphere of the ordinary sphere $S^2$ is topologically the 2-dimensional disk, otherwise known as the ball $B^2$, with the equator as a boundary. (The layperson, in his or her infinite wisdom, understands by the word “ball” the 3-dimensional object $B^3$, which has $S^2$ as its boundary. Similarly, the 2-dimensional ball $B^2$ is the disk, with the circle $S^1$ as its boundary.)

Thus, the spatial sections of AdS$^d$ are bounded by $S^{d-2}$, which may be thought of as Euclidean space $E^{d-2}$ with spatial infinity identified as a single point. Adding back the time coordinate, we extend $E^{d-2}$ to $M^{d-2,1}$. We finally conclude that the anti de Sitter spacetime AdS$^d$ is bounded by $M^{d-2,1}$. (In appendix 1, I give you a slightly more intuitive demonstration of this statement. In appendix 3, you will see that AdS$^2$ has 2 boundaries. Can you figure it out now?)

Hence the slang expression “tin can” for anti de Sitter spacetime and the explosive development of the AdS/CFT correspondence. In particular, in this picture, we are to regard the good old $M^{3,1}$ spacetime we live and play in as the boundary of a (4 + 1)-dimensional AdS$^5$ spacetime. Wouldn’t that be quite a laugh, if all this time we were actually living on the boundary of AdS$^5$ without knowing it? Sort of like the time when we realized that we were not living on $E^2$, but actually on the boundary of $B^3$.

For a more physical argument that a spatial boundary exists, picture the strip of paper we got from unrolling a roll of paper towels a short while ago, as shown in figure 1. The strip of paper, meant to represent AdS$^2$ and described by $ds^2 = -(1+r^2)\,dt^2 + \frac{dr^2}{1+r^2}$ (in other words, the line element in (8) with the angular component dropped), appears to be infinitely wide. It is important to note that instead of the embedding for AdS$^3$ in (9), we now have $T = \sqrt{1+r^2}\cos t$, $W = \sqrt{1+r^2}\sin t$, and $X = r$, and so the coordinate $r$, no longer a radial variable, actually ranges from $-\infty$ to $\infty$. (This is why AdS$^2$ is special, as we have alluded to; see appendix 3.)

But who is to say that the strip of paper is infinitely wide? As usual in relativity, we are to bounce light rays around to measure distances. Sitting at some fixed $r_*$, let us send a light beam to $r = \infty$ and wait for it to bounce back. Light follows the path $dt = \pm dr/(1+r^2)$ and so comes back after the amount of proper time $2\sqrt{1+r_*^2}\int_{r_*}^{\infty} dr/(1+r^2)$ for us. The important point is that the integral converges at the upper limit, and so the paper is actually finite in width, with two edges or boundaries. For AdS$^d$ with $d > 2$, the angular coordinates connect the “different” boundaries, and so AdS$^d$ has only one boundary. In other words, in the discussion above, we concluded that the spatial sections of AdS$^d$ are bounded by $S^{d-2}$, but $S^0$ consists of two points.
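The convergence of the round-trip integral can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `round_trip_proper_time` is made up here, and the closed form $\int_{r_*}^{\infty} dr/(1+r^2) = \pi/2 - \arctan r_*$ is used in place of a numerical quadrature.

```python
import math

def round_trip_proper_time(r_star: float) -> float:
    """Proper time for light to travel from r_star to r = infinity and back.

    Coordinate time for the round trip: 2 * (pi/2 - atan(r_star)).
    A static observer at r_star ages by sqrt(1 + r_star^2) per unit of t.
    """
    coordinate_time = 2.0 * (math.pi / 2.0 - math.atan(r_star))
    return math.sqrt(1.0 + r_star * r_star) * coordinate_time

if __name__ == "__main__":
    # The answer is finite for every r_star, so the strip has finite width.
    for r_star in (0.0, 1.0, 10.0, 1000.0):
        tau = round_trip_proper_time(r_star)
        print(f"r_* = {r_star:8.1f}   round-trip proper time = {tau:.6f}")
```

An observer at $r_* = 0$ waits a proper time $\pi$, and the wait stays finite (approaching 2 in these units) no matter how far out the observer sits.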


The conformal group of the bulk equals the isometry group of the boundary 


With what little we know, we can already see a key group-theoretic feature that underlies the AdS/CFT correspondence. In chapter IX.9, we learned that the conformal group for $M^{3,1}$ is $SO(4, 2)$. But earlier in this chapter, we learned that the isometry group for AdS$^5$ is also $SO(4, 2)$. The conformal group is the manifestation of the isometry group on the boundary.


Poincaré coordinates 


As we have already seen, anti de Sitter spacetime can be described using a variety of 
coordinates. We now turn to the Poincaré coordinates, which in recent years have enjoyed 
a resurgence of popularity, particularly in the context of AdS/CFT. 

Slightly rewrite the defining equation (6) for AdS$^3$ as $(T^2 - X^2) + (W^2 - Y^2) = 1$, which we solve by writing⁹ $T^2 - X^2 = \frac{t^2 - x^2}{w^2}$ and $W^2 - Y^2 = 1 + \frac{x^2 - t^2}{w^2}$, and

$$T = \frac{t}{w}, \qquad X = \frac{x}{w}$$

$$Y = \frac{1}{2}\left(\frac{x^2 - t^2}{w} + w - \frac{1}{w}\right) = \frac{1}{2w}\left(x^2 - t^2 + w^2 - 1\right)$$

$$W = \frac{1}{2}\left(\frac{x^2 - t^2}{w} + w + \frac{1}{w}\right) = \frac{1}{2w}\left(x^2 - t^2 + w^2 + 1\right) \qquad (13)$$

Note also that we start with four coordinates $(T, X, Y, W)$ and end with three $(t, x, w)$, since we are embedding a 3-dimensional spacetime into the 4-dimensional $M^{2,2}$.

Direct substitution of the seemingly awkward coordinate transformation (13) into $ds^2 = -dT^2 + dX^2 - dW^2 + dY^2$ leads to the amazingly simple form (do check it!)

$$ds^2 = \frac{1}{w^2}\left(-dt^2 + dx^2 + dw^2\right) \qquad (14)$$


You may recognize that this is just the Minkowskian version of the Poincaré half plane (introduced back in chapter I.5) in one higher dimension.

Incidentally, we could have avoided some labor by noting that (13) is invariant, up to an isometry of the embedding space, under the scaling $(t, x, w) \to \lambda(t, x, w)$ and under a Lorentz transformation on $(t, x)$, and thus (14) must be invariant under these same transformations.
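For readers who would rather let a computer "do check it," here is a minimal numerical sketch. It assumes the embedding (13) exactly as written above and compares the induced interval $-dT^2 + dX^2 - dW^2 + dY^2$, obtained by central differences, with the Poincaré form (14). The function names and the sample point are illustrative choices, not anything taken from the text.

```python
def embed(t, x, w):
    """Embedding coordinates (T, X, Y, W) of AdS^3 from Poincare (t, x, w), per (13)."""
    s = x * x - t * t
    T = t / w
    X = x / w
    Y = (s + w * w - 1.0) / (2.0 * w)
    W = (s + w * w + 1.0) / (2.0 * w)
    return T, X, Y, W

def defining_lhs(t, x, w):
    """Left-hand side of (T^2 - X^2) + (W^2 - Y^2) = 1."""
    T, X, Y, W = embed(t, x, w)
    return (T * T - X * X) + (W * W - Y * Y)

def induced_interval(p, dp):
    """-dT^2 + dX^2 - dW^2 + dY^2 for a small displacement dp about the point p."""
    plus = embed(*(a + 0.5 * b for a, b in zip(p, dp)))
    minus = embed(*(a - 0.5 * b for a, b in zip(p, dp)))
    dT, dX, dY, dW = (u - v for u, v in zip(plus, minus))
    return -dT * dT + dX * dX - dW * dW + dY * dY

def poincare_interval(p, dp):
    """(1/w^2)(-dt^2 + dx^2 + dw^2), i.e. equation (14)."""
    _, _, w = p
    dt, dx, dw = dp
    return (-dt * dt + dx * dx + dw * dw) / (w * w)

if __name__ == "__main__":
    p = (0.3, 0.7, 1.5)            # arbitrary sample point (t, x, w)
    dp = (1e-4, 2e-4, -3e-4)       # arbitrary small displacement
    print("defining equation:", defining_lhs(*p))
    print("induced ds^2     :", induced_interval(p, dp))
    print("Poincare ds^2    :", poincare_interval(p, dp))
```

The two intervals agree up to the truncation error of the finite difference, which shrinks quadratically with the size of the displacement.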

Again, as in the corresponding discussion for de Sitter spacetime, it is a good idea to go to light cone coordinates for the pseudo-time coordinate $W$ and the last spatial embedding coordinate $Y$:

$$W^+ \equiv W + Y = \frac{1}{w}\left(x^2 - t^2\right) + w, \qquad W^- \equiv W - Y = \frac{1}{w} \qquad (15)$$

You can now see relatively simply that the defining equation $T^2 - X^2 + W^+W^- = 1$ and (14) are satisfied.

It is now easy to generalize, going up in dimension. For AdS$^4$, write $T = \frac{t}{w}$, $X = \frac{x}{w}$, $Y = \frac{y}{w}$, and $W^+ = W + Z = \frac{1}{w}\left(x^2 + y^2 - t^2\right) + w$, $W^- = W - Z = \frac{1}{w}$. By now, it is almost immediate that AdS$^5$ is described by*

$$ds^2 = \frac{L^2}{w^2}\left(-dt^2 + dx^2 + dy^2 + dz^2 + dw^2\right) \qquad (16)$$

Basically, given (14), the preceding follows almost trivially: $y$ and $z$ are just going along for the ride. We have restored $L$ by dimensional analysis. Just like the Poincaré half plane, AdS$^5$ has a spatial boundary at $w = 0$. (In contrast, de Sitter spacetime as coordinatized in (IX.10.26) has
a temporal boundary at uv = 0.) 

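From the very simple form of the metric, you can see that the Christoffel symbol $\Gamma^{\cdot}_{\cdot\cdot}$ and the Riemann curvature tensor $R^{\cdot}_{\cdot\cdot\cdot} \sim \partial_{\cdot}\Gamma^{\cdot}_{\cdot\cdot} + \Gamma^{\cdot}_{\cdot\cdot}\Gamma^{\cdot}_{\cdot\cdot}$ go like $\sim 1/w$ and $\sim 1/w^2$, respectively, and thus vanish† as $w \to \infty$.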

A slice of this 5-dimensional spacetime at some specific value of $w$, say $w_*$, with the metric $ds^2 = \frac{L^2}{w_*^2}\left(-dt^2 + dx^2 + dy^2 + dz^2\right)$, is just the familiar 4-dimensional Minkowskian spacetime! See figure 3. (Contrast and compare with the exponentially expanding universe in (IX.10.23).)

Consider an object, for example a human, of physical size $\Delta l$ measured from head to toes, lined up along the $x$-axis, say. Then his head is separated from his toes by $\Delta x = w_* \Delta l$ (in units with $L = 1$). As $w_*$ decreases toward the boundary at $w = 0$, the coordinate size $\Delta x$ of the object shrinks.

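Transforming coordinates $w = L^2/r$, we obtain the alternative form

$$ds^2 = \frac{r^2}{L^2}\left(-dt^2 + dx^2 + dy^2 + dz^2\right) + \frac{L^2}{r^2}dr^2 = \left(-r^2dt^2 + \frac{1}{r^2}dr^2\right) + r^2 d\vec{x}^{\,2} \quad (\text{setting } L = 1) \qquad (17)$$

The boundary is at $r = \infty$, as indicated in figure 4. Note also that the metric is invariant under the scaling‡ $t \to \lambda t$, $\vec{x} \to \lambda\vec{x}$, and $r \to \lambda^{-1}r$.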


* In the literature, the notation $ds^2 = \frac{L^2}{z^2}\left(-dt^2 + d\vec{x}^{\,2} + dz^2\right)$ is often used, with $z$, a letter we already used for something else, denoting the coordinate “perpendicular” to the boundary.

† The scalar curvature is of course constant.

‡ In recent applications to condensed matter physics (in an area of research known as AdS/CMP), which is in general not Lorentz invariant, the metric $ds^2 = \left(-r^{2z}dt^2 + \frac{1}{r^2}dr^2\right) + r^2 d\vec{x}^{\,2}$ is used. The real number $z$ is known as a dynamical exponent. The time coordinate now scales as $t \to \lambda^z t$.

Figure 3 A slice of AdS$^d$ at some specific value of $w$ is the $(d-1)$-dimensional Minkowskian spacetime $M^{d-2,1}$.

Figure 4 The head and toes of a human of physical size $\Delta l$ located at $w_*$ and lined up along the $x$-axis are separated by $\Delta x = w_* \Delta l$. The horizontal axis runs from $r = 0$ (IR) to $r = \infty$ (UV), with $w$ increasing in the opposite direction. IR = infrared; UV = ultraviolet.


You might have noticed that combinations such as $w + \frac{1}{w}$ in (13) are practically begging us to write $w$ as an exponential and introduce cosh and sinh. Indeed, set $w = Le^{u/L}$ in (16) and write

$$ds^2 = e^{-2u/L}\left(-dt^2 + dx^2 + dy^2 + dz^2\right) + du^2 \qquad (18)$$

Note that the coordinate $u$ ranges from $-\infty$ to $+\infty$. We will use this form in chapter X.2 in our discussion of brane worlds.

By now it is clear how the Poincaré coordinates are to be defined for AdS$^d$. (Compare and contrast with what we had for dS$^d$.) Let

$$X^\mu = e^{-u}x^\mu$$

$$X^+ \equiv X^d + X^{d-1} = e^{-u}\,\eta_{\rho\sigma}x^\rho x^\sigma + e^{u}$$

$$X^- \equiv X^d - X^{d-1} = e^{-u} \qquad (19)$$

Here the indices $\mu$, $\rho$, and $\sigma$ run over $0, 1, \cdots, d-2$. The defining equation $\eta_{\mu\nu}X^\mu X^\nu - X^+X^- = -1$ is satisfied. Indeed, we do not even have to compute $ds^2 = \eta_{\mu\nu}dX^\mu dX^\nu - dX^+dX^-$. Various symmetries essentially fix it to be $ds^2 = e^{-2u}\eta_{\mu\nu}dx^\mu dx^\nu + du^2$, as was already mentioned. For example, as noted, the metric must be invariant under scaling $x^\mu$ and translating $u$, and under Lorentz transformations of $x^\mu$. The splitting of the embedding coordinates into the two sets $X^\mu$ and $(X^d, X^{d-1})$ reflects the two subgroups $SO(d-2, 1)$ and $SO(1, 1)$ contained in the isometry group $SO(d-1, 2)$.

Figure 5 A massive particle moving in the ($t$-$w$) plane toward the boundary at $w = 0$ cannot reach the boundary, but turns back at some $w_{\rm turn}$ determined by its initial position and speed.


Motion of light and massive particle in anti de Sitter spacetime 


The Poincaré coordinates in (16) are particularly suitable for studying the motion of light and particles in anti de Sitter spacetime. That it is conformally equivalent to the Minkowski metric $d\tilde{s}^2 = \left(-dt^2 + dx^2 + dy^2 + dz^2 + dw^2\right)$ means that light follows a path determined by $ds^2 = 0 = d\tilde{s}^2$. A light beam sent by an observer located at $w = w_0$ toward the boundary at $w = 0$ will come back, if a mirror were placed appropriately, to her after coordinate time $t_{\rm return} = 2w_0$. In contrast, consider a massive particle moving* in the ($t$-$w$) plane toward the boundary, as shown in figure 5. The definition of proper time $g_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = -1$ gives us $\left(\frac{dt}{d\tau}\right)^2 - \left(\frac{dw}{d\tau}\right)^2 = w^2$. The isometry under $t \to t + \text{constant}$ gives as usual the conservation law $\frac{d}{d\tau}\left(w^{-2}\frac{dt}{d\tau}\right) = 0$, and hence $\frac{dt}{d\tau} = \frac{w^2}{b}$ for some constant $b$. We thus obtain $\left(\frac{dw}{dt}\right)^2 + \frac{b^2}{w^2} = 1$. The potential in the analog Newtonian problem is thus $V(w) = +\frac{b^2}{w^2}$, and we see that the particle cannot reach the boundary but turns back at $w_{\rm turn} = b$, with $b$ determined by its initial position and speed.

* You might realize that this is basically the same problem as one we worked out in chapter II.2.
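As a quick illustration of the turning point, note that the equation $(dw/dt)^2 + b^2/w^2 = 1$ is solved by the hyperbola $w(t) = \sqrt{b^2 + (t - t_{\rm turn})^2}$. This closed form is not quoted in the text; it is supplied here as an assumption that the sketch below then checks numerically. The names `w_of_t` and `energy_residual` are made up for the illustration.

```python
import math

def w_of_t(t: float, b: float, t_turn: float = 0.0) -> float:
    """Candidate trajectory w(t) = sqrt(b^2 + (t - t_turn)^2) in the (t-w) plane."""
    return math.sqrt(b * b + (t - t_turn) ** 2)

def energy_residual(t: float, b: float, h: float = 1e-5) -> float:
    """(dw/dt)^2 + b^2/w^2 - 1, with dw/dt estimated by a central difference."""
    dwdt = (w_of_t(t + h, b) - w_of_t(t - h, b)) / (2.0 * h)
    w = w_of_t(t, b)
    return dwdt * dwdt + b * b / (w * w) - 1.0

if __name__ == "__main__":
    b = 0.5
    for t in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"t = {t:5.1f}   w = {w_of_t(t, b):.4f}   residual = {energy_residual(t, b):.2e}")
```

The particle comes in from large $w$, reaches its closest approach $w = b$ at $t = t_{\rm turn}$, and heads back out; it never touches $w = 0$.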

This simple exercise in classical general relativity already hints at a key feature that 
makes possible AdS/CFT. At first sight, it seems impossible that the physics of a (3 + 1)- 
dimensional theory could be mapped onto the physics of a (4 + 1)-dimensional theory. It 
turns out that particles carrying different energies in the boundary theory correspond to 
particles located at different positions along the w-axis in the bulk. The boundary theory 
is able to “grow” a spatial coordinate orthogonal to it. 

By this point, you might be wondering if there is an analogous story for de Sitter 
spacetime. The answer is that some theoretical physicists are working intensively on 
establishing a dS/CFT correspondence. Intriguingly, de Sitter spacetime has a temporal 
boundary, as you may recall from the preceding chapter, rather than a spatial boundary, 
and so, if some kind of dS/CFT correspondence does in fact hold, the boundary theory 
would have to grow a temporal coordinate. Might this shed some light on the origin of 
time? Note that the boundary theory does not contain time, and thus represents some 
kind of statistical mechanics rather than a dynamical field theory. 


Other forms of anti de Sitter spacetime 


Just like de Sitter spacetime, anti de Sitter spacetime can be written in a bewildering variety 
of forms, as we have already mentioned. 

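We obtain an interesting form by changing the variable $r = 2\zeta/(1 - \zeta^2)$ in (8). Then $1 + r^2 = \left[(1 + \zeta^2)/(1 - \zeta^2)\right]^2$ and $dr = 2\left((1 + \zeta^2)/(1 - \zeta^2)^2\right)d\zeta$. We thus obtain the alternative form

$$ds^2 = -\left(\frac{1+\zeta^2}{1-\zeta^2}\right)^2 dt^2 + \frac{4}{(1-\zeta^2)^2}\left(d\zeta^2 + \zeta^2 d\Omega^2_{d-2}\right) = -\left(\frac{1+\zeta^2}{1-\zeta^2}\right)^2 dt^2 + \frac{4\,\delta_{ij}dx^i dx^j}{(1-\zeta^2)^2} \qquad (20)$$

where $\zeta^2 = \delta_{ij}x^i x^j$. We have a ball defined by $\zeta < 1$, with its boundary at $\zeta = 1$.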
To derive the next form, set $d = 3$ for definiteness and write $r^2 = X^2 + Y^2$. The defining equation $T^2 - r^2 + W^2 = 1$ invites us to define $T = \rho\sinh\chi$ and $r = \rho\cosh\chi$, so that $dT^2 - dr^2 = \rho^2 d\chi^2 - d\rho^2$. (You may recognize this as the Rindler coordinates that you worked on way back in exercise III.3.2 and that we discussed in chapter VII.3.) The defining equation then becomes $W^2 = 1 + \rho^2$. Differentiating, we have $W\,dW = \rho\,d\rho$, so that $dW^2 = \frac{\rho^2}{1+\rho^2}d\rho^2$. We end up with

$$ds^2 = -\rho^2 d\chi^2 + \left(\frac{d\rho^2}{1+\rho^2} + \rho^2\cosh^2\chi\, d\theta^2\right) = -\sinh^2\psi\, d\chi^2 + d\psi^2 + \sinh^2\psi\cosh^2\chi\, d\theta^2 \qquad (21)$$

where $\rho = \sinh\psi$, so that we now have

$$T = \sinh\psi\sinh\chi, \qquad r = \sinh\psi\cosh\chi, \qquad W = \cosh\psi \qquad (22)$$


For another form, let

$$T = \sin t\cosh\chi, \qquad X = r\cos\theta, \qquad Y = r\sin\theta, \qquad W = \cos t \qquad (23)$$

with $r = \sin t\sinh\chi$. Then we obtain

$$ds^2 = -dt^2 + \sin^2 t\left(d\chi^2 + \sinh^2\chi\, d\theta^2\right) = -dt^2 + \sin^2 t\; dH_2^2 \qquad (24)$$

More generally, $ds^2 = -dt^2 + \sin^2 t\; dH^2_{d-1}$ for AdS$^d$.


Anti de Sitter spacetime in hyperbolic coordinates 


Again for definiteness, let us consider AdS$^3$. The trick is to rewrite the defining equation (6) and the metric (7) as $(T^2 - X^2) + (W^2 - Y^2) = 1$ and $ds^2 = (-dT^2 + dX^2) + (-dW^2 + dY^2)$, respectively. Let $T = R\cosh t$, $X = R\sinh t$, $W = r\cosh\psi$, and $Y = r\sinh\psi$. We have $ds^2 = -dR^2 + R^2dt^2 - dr^2 + r^2d\psi^2$. Also, $R^2 = 1 - r^2$, $R\,dR = -r\,dr$, $dR^2 = \frac{r^2}{1-r^2}dr^2$, and $dR^2 + dr^2 = \frac{1}{1-r^2}dr^2$. Hence we obtain

$$ds^2 = -\left(r^2 - 1\right)dt^2 + \frac{dr^2}{r^2 - 1} + r^2 d\psi^2 \qquad (25)$$

Note that this requires an analytic continuation: the metric (25) only makes sense for $r > 1$, which requires $R^2$ to be negative. This construction is readily generalized to AdS$^d$: $ds^2 = -(r^2 - 1)dt^2 + \frac{dr^2}{r^2 - 1} + r^2 dH^2_{d-2}$. (As in the preceding chapter, $dH^2_d$ denotes the hyperbolic line element defined in chapter I.6 that has already appeared in de Sitter spacetime.) Compare with (8) and (IX.10.35). In particular, for AdS$^4$, we have

$$ds^2 = -\left(r^2 - 1\right)dt^2 + \frac{dr^2}{r^2 - 1} + r^2\left(d\psi^2 + \sinh^2\psi\, d\varphi^2\right) \qquad (26)$$


Stereographic projection for anti de Sitter spacetime 


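As for de Sitter spacetime, we can stereographically project anti de Sitter spacetime by mapping $(X^0, X^1, X^2, X^3, X^4)$ into $(x^0, x^1, x^2, x^3)$ as follows (reinstating $L$):

$$X^M = \frac{x^M}{1 - \frac{x^2}{4L^2}}, \qquad M = 0, 1, 2, 3 \qquad (27)$$

and

$$X^4 = L\left(\frac{1 + \frac{x^2}{4L^2}}{1 - \frac{x^2}{4L^2}}\right) \qquad (28)$$

where as before, $x^2 = -(x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2$. Now (27) says that $-(X^0)^2 + (X^1)^2 + (X^2)^2 + (X^3)^2 = x^2/\left(1 - \frac{x^2}{4L^2}\right)^2$. Verify that the defining relation (1) is satisfied and that

$$ds^2 = \frac{1}{\left(1 - \frac{x^2}{4L^2}\right)^2}\,\eta_{\mu\nu}dx^\mu dx^\nu \qquad (29)$$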
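The two verifications requested above can also be done numerically. The sketch below assumes that the defining relation (1), which is not reproduced in this section, reads $\eta_{\mu\nu}X^\mu X^\nu - (X^4)^2 = -L^2$, and that the embedding metric is $-(dX^0)^2 + (dX^1)^2 + (dX^2)^2 + (dX^3)^2 - (dX^4)^2$. All function names and sample values are illustrative.

```python
def stereographic_embed(x, L=1.0):
    """Map x = (x0, x1, x2, x3) to (X0, ..., X4) following (27) and (28)."""
    x2 = -x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2
    a = x2 / (4.0 * L * L)
    X = [xi / (1.0 - a) for xi in x]
    X.append(L * (1.0 + a) / (1.0 - a))
    return X

def quadratic_form(X):
    """-X0^2 + X1^2 + X2^2 + X3^2 - X4^2 (assumed embedding signature)."""
    return -X[0] ** 2 + X[1] ** 2 + X[2] ** 2 + X[3] ** 2 - X[4] ** 2

def induced_interval(x, dx, L=1.0):
    """Embedding-space interval for a small displacement dx, by central differences."""
    plus = stereographic_embed([a + 0.5 * b for a, b in zip(x, dx)], L)
    minus = stereographic_embed([a - 0.5 * b for a, b in zip(x, dx)], L)
    return quadratic_form([p - m for p, m in zip(plus, minus)])

def conformally_flat_interval(x, dx, L=1.0):
    """Right-hand side of (29)."""
    x2 = -x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2
    eta_dx_dx = -dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + dx[3] ** 2
    return eta_dx_dx / (1.0 - x2 / (4.0 * L * L)) ** 2

if __name__ == "__main__":
    L = 2.0
    x = [0.2, 0.5, -0.3, 0.1]
    dx = [1e-4, -2e-4, 3e-4, 1e-4]
    print("defining relation / L^2:", quadratic_form(stereographic_embed(x, L)) / L ** 2)
    print("induced ds^2           :", induced_interval(x, dx, L))
    print("conformally flat ds^2  :", conformally_flat_interval(x, dx, L))
```

The first printed number should be $-1$ to rounding accuracy, and the last two should agree up to finite-difference error.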


Anti de Sitter spacetime is conformally flat, just like de Sitter spacetime (see the table). No surprise there.

One of my students speaks up at this point. “You shouldn’t say that,” he says. “Everybody knows that the sphere is not flat; that it is conformally flat is kind of a surprise.” But then, since de Sitter spacetime is sort of a Minkowskian sphere, it is arguably less surprising that it is also conformally flat. But that anti de Sitter spacetime is also conformally flat, now that, to him, is surprising. I suppose that it is fair to conclude that everybody has different thresholds for being surprised.¹⁰


Appendix 1: Euclidean anti de Sitter space and its boundary 


At first sight, the appearance of a boundary in anti de Sitter spacetime is rather puzzling (but perhaps not so much to readers of this text, since we have already discussed the Poincaré half plane in part I). As far as I know, it is easiest to visualize this boundary if, using the stereographic coordinates in (29), we go to Euclidean anti de Sitter space.

Replace $\eta_{\mu\nu}$ in (29) by $\delta_{\mu\nu}$, thus going from Minkowski back to Pythagoras, which we can do by formally writing $X^0 = iX^\tau$ and $x^0 = ix^\tau$. Set, in direct analogy with (27) and (28),

$$X^M = \frac{x^M}{1 - \frac{x^2}{4L^2}}, \quad M = \tau, 1, \cdots, d-1, \qquad X^d = L\left(\frac{1 + \frac{x^2}{4L^2}}{1 - \frac{x^2}{4L^2}}\right)$$

Here $x^2 = (x^\tau)^2 + (x^1)^2 + (x^2)^2 + \cdots + (x^{d-1})^2$ denotes the Euclidean square of the $d$-dimensional vector $(x^\tau, x^1, x^2, \cdots, x^{d-1})$. (A word about notation: Perhaps it would have been more natural to denote $x^\tau$ by $x^d$, but this would have led to a potential confusion, since $X^d$ and $x^d$ are not directly related.)

Then $(X^\tau)^2 + (X^1)^2 + (X^2)^2 + \cdots + (X^{d-1})^2 = x^2/\left(1 - \frac{x^2}{4L^2}\right)^2$, so that the defining relation

$$(X^\tau)^2 + (X^1)^2 + \cdots + (X^{d-1})^2 - (X^d)^2 = -L^2$$

for Euclidean anti de Sitter space is satisfied. We obtain

$$ds^2 = \frac{1}{\left(1 - \frac{x^2}{4L^2}\right)^2}\left((dx^\tau)^2 + (dx^1)^2 + (dx^2)^2 + \cdots + (dx^{d-1})^2\right) \qquad (30)$$

which we can also see by analytically continuing (29). Again, it is not a surprise that the Euclidean anti de Sitter space AdS$^d$ is conformally related to Euclidean space. More importantly in this context, we note that it is topologically the Euclidean ball $B^d$ defined by

$$x^2 < 4L^2 \qquad (31)$$

which, as every child knows, has a boundary described by $S^{d-1}$. (The sphere $S^{d-1}$ is just $(d-1)$-dimensional Euclidean space $E^{d-1}$ with infinity identified as a single point.) This is of course the Euclidean version of the statement that AdS$^d$ has $M^{d-2,1}$ for its boundary. The metric tensor, and hence, by the basic theorem about maximally symmetric space, the curvature and the Ricci tensor (but not the scalar curvature) all diverge as we approach the boundary.


Appendix 2: Isomorphism between AdS? and SL(2, R) 


The most general 2-by-2 matrix with real entries may be written as

$$U = \begin{pmatrix} T + X & Y + W \\ Y - W & T - X \end{pmatrix}$$

The condition $\det U = 1$ implies $T^2 - X^2 - Y^2 + W^2 = 1$. Under multiplication, the set of all 2-by-2 matrices with real entries and unit determinant clearly forms a group, known as $SL(2, R)$.

What we have just discovered is that AdS$^3$ is isomorphic to the universal cover of $SL(2, R)$: there is 1-to-1 correspondence between points on AdS$^3$ and elements of $SL(2, R)$.

Evidently, for $V$ and $Z$ any two elements of $SL(2, R)$, then $U' = VUZ$ also has unit determinant, and thus, as an element of $SL(2, R)$, corresponds to another point on AdS$^3$. In other words, the isometry group of AdS$^3$ is $SL(2, R) \times SL(2, R)$. But we know from the text that the isometry group* of AdS$^3$ is $SO(2, 2)$, yet this is consistent, since $SO(2, 2)$ is in fact (see below) isomorphic to $SL(2, R) \times SL(2, R)$. Incidentally, this beautiful piece of group theory is relevant to the recent use of twistors¹¹ to calculate scattering amplitudes in quantum field theory.

We now exhibit the isomorphism $SO(2, 2) = SL(2, R) \times SL(2, R)$ explicitly. In chapter IX.9, we saw that the 6 generators of $SO(2, 2)$, $\partial_\pm$, $x^\pm\partial_\pm$, $-(x^\pm)^2\partial_\pm$, break into two mutually commuting sets, evidently corresponding to 2 copies of $SL(2, R)$. Consider an element $I + A$ of $SL(2, R)$ close to the identity. Using an identity we have encountered repeatedly, we evaluate its determinant to be $\det(I + A) = e^{\mathrm{Tr}\log(I + A)} \simeq 1 + \mathrm{Tr}\,A$. Thus, the generators of $SL(2, R)$ consist of 2-by-2 traceless matrices, among which we choose the linearly independent set

$$T_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad T_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad T_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (32)$$

You can verify that the desired identification is

$$\partial \sim T_-, \qquad x\partial \sim \tfrac{1}{2}T_3, \qquad -x^2\partial \sim T_+ \qquad (33)$$

Note the minus sign, as determined in the preceding chapter.
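The statement that $U' = VUZ$ lands on another point of AdS$^3$ is easy to test with explicit numbers. The following sketch uses plain nested lists for the 2-by-2 matrices; the helper names, the sample point, and the particular choices of $V$ and $Z$ are illustrative only.

```python
import math

def matrix_from_point(T, X, Y, W):
    """U = [[T + X, Y + W], [Y - W, T - X]] as in the text."""
    return [[T + X, Y + W], [Y - W, T - X]]

def point_from_matrix(U):
    """Invert the parametrization to recover (T, X, Y, W)."""
    T = 0.5 * (U[0][0] + U[1][1])
    X = 0.5 * (U[0][0] - U[1][1])
    Y = 0.5 * (U[0][1] + U[1][0])
    W = 0.5 * (U[0][1] - U[1][0])
    return T, X, Y, W

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def quadratic_form(T, X, Y, W):
    """T^2 - X^2 - Y^2 + W^2, which equals det U."""
    return T * T - X * X - Y * Y + W * W

if __name__ == "__main__":
    # Pick T, X, Y freely and solve the defining equation for W.
    T, X, Y = 0.3, 0.4, 1.2
    W = math.sqrt(1.0 - T * T + X * X + Y * Y)
    U = matrix_from_point(T, X, Y, W)

    V = [[2.0, 1.0], [1.0, 1.0]]   # det V = 1
    Z = [[1.0, 3.0], [0.0, 1.0]]   # det Z = 1
    U_new = matmul(matmul(V, U), Z)

    print("det U      :", det(U))
    print("det V U Z  :", det(U_new))
    print("new point  :", point_from_matrix(U_new))
    print("form at new:", quadratic_form(*point_from_matrix(U_new)))
```

Left and right multiplication move the point around on the hyperboloid but never off it, which is the content of the isometry statement.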


Appendix 3: AdS? and its two boundaries 


Readers into group theory know that “smaller” groups often exhibit special features that the “larger” groups do not have. Similarly, AdS$^2$ differs from its higher dimensional counterparts by having two boundaries. This apparently puzzling assertion actually follows from an extremely elementary geometric fact. Consider the unit Euclidean ball $B^d$ defined by $\sum_{i=1}^{d}(x^i)^2 \le 1$. The boundary of the disk $B^2$ is the circle $S^1$, of the everyday ball $B^3$ is the sphere $S^2$, and so on. But the boundary of $B^1$, namely $S^0$, consists of two disconnected points. (Indeed, we already encountered this phenomenon in chapter VII.2.)

The boundary of AdS$^d$ is more visible with some coordinate choices than with others. In particular, with the coordinates used in (12), we have for AdS$^2$

$$ds^2 = \frac{1}{\cos^2\psi}\left(-dt^2 + d\psi^2\right) \qquad (34)$$

The crucial difference with (12) is that now $\psi$ runs from $-\pi/2$ to $\pi/2$. The rectangular strip that describes AdS$^2$ in the ($t$-$\psi$) plane obviously has two boundaries at $\psi = \pm\pi/2$. (For the coordinates used in (10), with $\sinh\rho = \tan\psi$, the two boundaries are at $\rho = \pm\infty$. In terms of the original embedding coordinates $T = \cosh\rho\cos t$, $W = \cosh\rho\sin t$, and $X = \sinh\rho$, the boundaries correspond to the two end caps of the tube in figure 1 at $X = \pm\infty$.)

The presence of the two boundaries is less transparent in Poincaré coordinates:

$$ds^2 = \frac{1}{w^2}\left(-dt^2 + dw^2\right) \qquad (35)$$

However, inspection of the coordinate transformation $T = \frac{t}{w}$, $W = \frac{1}{2w}\left(-t^2 + w^2 + 1\right)$, and $X = \frac{1}{2w}\left(-t^2 + w^2 - 1\right)$ reveals that it is actually at $w = 0^\pm$. Or, with $r = w^{-1}$, as in the text, we have $ds^2 = -r^2dt^2 + r^{-2}dr^2$. Note that $-\infty < r < \infty$.

It is also instructive to relate the Poincaré coordinates to those used in (34) (but before we do that, we have to rename the time coordinate in the latter as $\tau$). We find

$$w = \frac{\sin\psi + \sin\tau}{\cos^2\psi - \cos^2\tau}\cos\psi, \qquad t = \frac{\sin\psi + \sin\tau}{\cos^2\psi - \cos^2\tau}\cos\tau \qquad (36)$$

From (36), we also see that $w = \pm\infty$ corresponds to the two lines $\psi = \tau$ and $\psi = -\tau$.

Referring to the table contrasting isometry groups for de Sitter and anti de Sitter spacetimes given earlier in this chapter, we also observe the amusing fact that the isometry group $SO(1, 2)$ for AdS$^2$ and the isometry group $SO(2, 1)$ for dS$^2$ are actually the same, a fact that we can see geometrically from figures IX.10.1 and 1.

* In comparison, the isometry group of dS$^3$ is $SO(3, 1) = SL(2, C)/Z_2$.¹²


Appendix 4: Continuing from de Sitter to anti de Sitter spacetimes 


From the defining equations (1) and (2), we see that formally we can go from de Sitter to anti de Sitter spacetime by analytically continuing $L \to iL$ and $X^d \to iX^d$. (By now, $X^d$ is perhaps better known to us as $W$.) Alternatively, instead of thinking about the embedding spacetimes, we can consider a specific set of coordinates $x^\mu$ and solve Einstein’s field equation. Evidently, a solution of $R_{\mu\nu} = -\left(\frac{d-1}{L^2}\right)g_{\mu\nu}$, which after all is a bunch of coupled partial (or ordinary) differential equations, becomes a solution of $R_{\mu\nu} = +\left(\frac{d-1}{L^2}\right)g_{\mu\nu}$ when we formally set $L \to iL$. However, this procedure may or may not result in a spacetime (see below), and further continuations in the coordinates $x^\mu$ will in general be needed.

Let us see how this works in a few cases. For example, taking the first entry $ds^2 = -\frac{dt^2}{1 + t^2/L^2} + t^2 dH_3^2$ in the table for de Sitter spacetime in the preceding chapter, we plug in $L \to iL$ and encounter no trouble, thus reproducing an entry in the table for anti de Sitter spacetime given in this chapter. In contrast, taking the second entry, $ds^2 = \frac{L^2}{\cos^2\theta}\left(-d\theta^2 + \sin^2\theta\, dH_3^2\right)$, and flipping the sign of $L^2$, we would encounter something with 3 time coordinates and 1 space coordinate. Thus, we obviously have to analytically continue $\theta$ also, $\theta \to i\rho$, thus obtaining $ds^2 = \frac{L^2}{\cosh^2\rho}\left(-d\rho^2 + \sinh^2\rho\, dH_3^2\right)$. (Useful identities in this context: $\cosh ix = \cos x$, $\sinh ix = i\sin x$.) Another approach is to use coordinates with dimension of length. So, first write $\theta = \frac{\rho}{L}$ and the de Sitter metric as $ds^2 = \frac{1}{\cos^2\frac{\rho}{L}}\left(-d\rho^2 + L^2\sin^2\frac{\rho}{L}\, dH_3^2\right)$. Then continue $L \to iL$.

Another way of saying this is that if we use dimensionless coordinates, such as angles cyclic and hyperbolic, then by dimensional analysis, $ds^2$ has to be proportional to $L^2$, and so are the metric components $g_{\mu\nu}$. If we flip the sign of the metric, the Christoffel symbol $\Gamma^{\cdot}_{\cdot\cdot} \sim g^{\cdot\cdot}\partial_{\cdot} g_{\cdot\cdot}$, the Riemann curvature tensor (with one upper and three lower indices) $R^{\cdot}_{\cdot\cdot\cdot}$, and the Ricci tensor $R_{\cdot\cdot}$ do not flip, but the scalar curvature $R$ does. Einstein’s field equation $R_{\cdot\cdot} \sim L^{-2}g_{\cdot\cdot}$ flips between de Sitter and anti de Sitter spacetime, as it must.


Appendix 5: Geodesics in the embedding space 


As in appendix 3 in chapter IX.10, we can discuss geodesics as visualized with the embedding coordinates $X^M$ satisfying $X^2 = \eta_{MN}X^M X^N = -1$. Go through the same steps as in that appendix, introducing a Lagrange multiplier and so forth but keeping in mind that in the present case, $\eta = (-, +, \cdots, +, -)$. In particular, in the embedding space* a photon zips merrily along a straight line in the sense that it follows $\ddot{X} = 0$. As in the preceding chapter, we can verify that the geodesics in the embedding space do describe geodesics in whatever coordinates are used to map anti de Sitter spacetime.

Confusio mumbles that, since he now understands how this went in the preceding chapter, there is no sense in checking it again. We respond that since we used a rather unnatural looking transformation (13) to define the Poincaré coordinates, we think that it is still fun to see how the laws of arithmetic work. A photon, for instance, traces out

$$X = a + b\zeta \qquad (37)$$

with $a^2 = -1$, $b^2 = 0$, $a \cdot b = 0$ (to ensure that $X^2 = -1$).

Consider a photon moving along the $x$-axis. From $ds^2 = \frac{1}{w^2}\left(-dt^2 + dx^2 + dw^2\right)$, we have $dx = dt$, $x = t$, and $w = w_*$ (with $w_*$ arbitrary). Note that anti de Sitter spacetime is translation invariant in $t$ and $x$ but not in $w$. Referring to (13), we translate $x = t$ into $X = (2t, 2t, w_*^2 - 1, w_*^2 + 1)/(2w_*)$, so that $a = (0, 0, w_*^2 - 1, w_*^2 + 1)/(2w_*)$ and $b = (1, 1, 0, 0)$. Note that the normalization of $b$ can be absorbed into the definition of $\zeta$. We see that indeed, $a^2 = -1$, $b^2 = 0$, and $a \cdot b = 0$.

Confusio: “You seem to use ‘translate’ in two senses.”

Yes, literally and metaphorically.

Work out another case for fun. Let the photon move along the $w$-axis, so that $w = t$, which by (13) translates into $X = (1, 0, -(2t)^{-1}, (2t)^{-1})$. Remarkably, $T = X^0$ stays constant. But the other “time” $W$ is not standing still! Thus, $a = (1, 0, 0, 0)$ and $b = (0, 0, -1, 1)$, with $\zeta = (2t)^{-1}$. Indeed, $a^2 = -1$, $b^2 = 0$, and $a \cdot b = 0$. No surprise that the laws of arithmetic work, even in general relativity!

* You might have noticed that I avoid using the term “embedding spacetime” in this chapter, since I wouldn’t know what to call a space with two time coordinates.
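The arithmetic for the first photon can be spot-checked in a few lines. The sketch below assumes the embedding (13) with the coordinates ordered as $(T, X, Y, W)$ and the signature $\eta = \mathrm{diag}(-1, +1, +1, -1)$ in that order; the function names and sample numbers are illustrative.

```python
def embed(t, x, w):
    """(T, X, Y, W) from Poincare coordinates (t, x, w), following (13)."""
    s = x * x - t * t
    return (t / w, x / w, (s + w * w - 1.0) / (2.0 * w), (s + w * w + 1.0) / (2.0 * w))

def dot(u, v):
    """Inner product with eta = diag(-1, +1, +1, -1) in the order (T, X, Y, W)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] - u[3] * v[3]

def photon_along_x(w_star):
    """Vectors a and b for the photon x = t at fixed w = w_star, with zeta = t / w_star."""
    a = (0.0, 0.0, (w_star ** 2 - 1.0) / (2.0 * w_star), (w_star ** 2 + 1.0) / (2.0 * w_star))
    b = (1.0, 1.0, 0.0, 0.0)
    return a, b

if __name__ == "__main__":
    w_star, t = 1.7, 0.9
    a, b = photon_along_x(w_star)
    zeta = t / w_star
    straight_line = tuple(ai + bi * zeta for ai, bi in zip(a, b))
    print("embedded point :", embed(t, t, w_star))
    print("a + b * zeta   :", straight_line)
    print("a.a, b.b, a.b  :", dot(a, a), dot(b, b), dot(a, b))
```

The embedded worldline coincides with the straight line $a + b\zeta$, and the three inner products come out as $-1$, $0$, and $0$.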


Appendix 6: Scalar field in AdS/CFT 


While the AdS/CFT correspondence is beyond the scope of this text, we can mention that one important step involves solving the equation of motion for a scalar field (of mass $m$) in AdS$^{d+1}$, namely $(\Box - m^2)\phi = \frac{1}{\sqrt{-g}}\partial_\mu\left(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\phi\right) - m^2\phi = 0$. And this the devoted reader who has gotten this far is able to do.

Referring to the metric $ds^2 = \left(-r^2dt^2 + \frac{1}{r^2}dr^2\right) + r^2d\vec{x}^{\,2}$ given in (17), we have $-g = r^{2(d-1)}$ and $g^{tt} = -1/r^2$, $g^{xx} = 1/r^2$, $\cdots$, but $g^{rr} = r^2$. Thus, $\Box$ contains terms like $(-\partial_t^2 + \partial_x^2 + \cdots)/r^2$, terms which may be neglected near the boundary at $r = \infty$. In contrast, $g^{rr}$ grows near the boundary. Hence, the equation of motion near the boundary reduces to

$$\frac{1}{r^{d-1}}\partial_r\left(r^{d+1}\partial_r\phi\right) - m^2\phi = 0 \qquad (38)$$

This equation, homogeneous in powers of $r$, can be solved by plugging in $\phi \sim r^{\lambda}$. We obtain a quadratic equation for the exponent $\lambda$ with the roots $\Delta - d$ and $-\Delta$, where

$$\Delta = \frac{d}{2} + \sqrt{\left(\frac{d}{2}\right)^2 + m^2} \qquad (39)$$

Thus, we obtain

$$\phi(r \to \infty, t, \vec{x}) = \alpha(t, \vec{x})\,r^{\Delta - d} + \beta(t, \vec{x})\,r^{-\Delta} \qquad (40)$$

Since $\Delta - d > 0$, we have to impose the boundary condition $\alpha(t, \vec{x}) = 0$. The AdS/CFT correspondence states that the expectation value of a certain quantum field theoretic operator living on the boundary of AdS$^{d+1}$ is then given by $\beta(t, \vec{x})$. As for the question why oh why, the answer is not contained in this textbook.
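Plugging a power law $r^{\lambda}$ into (38) gives the indicial equation $\lambda(\lambda + d) - m^2 = 0$. The short sketch below, with made-up function names, evaluates $\Delta$ from (39) and confirms that $\Delta - d$ and $-\Delta$ are the two roots for a few sample values of $d$ and $m^2$.

```python
import math

def delta(d: int, m_squared: float) -> float:
    """Delta = d/2 + sqrt((d/2)^2 + m^2), equation (39)."""
    return 0.5 * d + math.sqrt((0.5 * d) ** 2 + m_squared)

def indicial(exponent: float, d: int, m_squared: float) -> float:
    """Left-hand side of exponent * (exponent + d) - m^2 = 0, obtained from (38)."""
    return exponent * (exponent + d) - m_squared

if __name__ == "__main__":
    for d, m_squared in ((4, 5.0), (4, 0.0), (3, 1.0), (2, 0.75)):
        big_delta = delta(d, m_squared)
        print(f"d = {d}, m^2 = {m_squared}: Delta = {big_delta:.6f}, "
              f"residuals = {indicial(big_delta - d, d, m_squared):.2e}, "
              f"{indicial(-big_delta, d, m_squared):.2e}")
```

For instance, $d = 4$ and $m^2 = 5$ give $\Delta = 5$, so the two falloffs are $r^{1}$ and $r^{-5}$; the growing mode is the one removed by the boundary condition.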


Appendix 7: Coset manifolds and the classification of space and spacetime 


By the end of the 19th century, it was understood¹³ that space could be Euclidean, spherical, or hyperbolic. In the
language of coset manifolds, we can start with two empirical observations and arrive at these three possibilities. 
The isotropy of space implies that space is of the form G/SO(3). The 3-dimensionality of space implies that 
G must have 3 + 3 generators. There are 3 groups with 6 generators, namely G = E(3) (the Euclidean group 
consisting of rotations and translations), G = SO(4), and G = SO(3, 1), corresponding to Euclidean, spherical, 
and hyperbolic, respectively. Note the appearance in this context of the Lorentz group long before special relativity! 

Now let us generalize this discussion to spacetime. We know that spacetime is Lorentz invariant and 
4-dimensional. Thus, if spacetime is homogeneous, it should be of the form G/SO(3, 1), with G having 
6+4=10 generators. Again, there are 3 possibilities, namely G = E(3, 1) (the Poincaré group consisting 
of Lorentz transformations and translations), G = SO(4, 1), and G = SO(3, 2), corresponding to Minkowski, 
de Sitter, and anti de Sitter spacetime, respectively. 


Notes 


1. For example, P. A. M. Dirac, “The Electron Wave Equation in de-Sitter Space,” Annals Math. (1935), p. 657, and “A Remarkable Representation of the 3+2 de Sitter Group,” J. Math. Phys. (1963), p. 901.

2. Strictly speaking, $M^{3,1}$ constitutes a patch of the boundary, which globally has the topology of $R \times S^3$. Recall the discussion in appendix 5 of chapter VII.2.

3. This aspect of anti de Sitter spacetime was emphasized by S. Hawking and D. Page, Comm. Math. Phys. 87 (1983), p. 577.

4. The reader is referred to many excellent reviews, in particular, O. Aharony, S. Gubser, J. Maldacena, H. Ooguri, and Y. Oz, Phys. Reports 323 (2000), p. 183. For applications to condensed matter physics, see various reviews by S. Hartnoll, by J. McGreevy, and by C. Herzog.

5. For general overviews, see J. Maldacena, “The Illusion of Gravity,” Scientific American, November 2005, p. 57; and C. V. Johnson and P. Steinberg, Physics Today, May 2010, p. 29.

6. Have you ever wondered what a world with two time coordinates would be like? Science fiction writers have long played with time travel. You could be the first to toy with two times!

7. There have been speculations, of course, about more than one time. See papers by I. Bars, and by G. R. Dvali, G. Gabadadze, and G. Senjanovic.

8. Again, nothing prevents people from speculating about time travel and the like. Look up on the web the discussions surrounding the chronology protection conjecture.

9. See appendix 1 in the preceding chapter.

10. This reminds me of a story about a famous dictionary maker.

11. For an elementary introduction to this fascinating subject, see, for example, QFT Nutshell, chapter N.3 and appendix B.

12. QFT Nutshell, p. 532.

13. H. Helmholtz, “The Origin and Meaning of Geometrical Axioms,” Mind 1 (1876), pp. 301-321, http://www.jstor.org/stable/2246591.


Recap to Part IX 


Part IX consists of a collection of topics that hopefully amuses and amazes. 

Transporting a vector by keeping it parallel to itself and eventually returning to its 
starting point tells us about curvature. A precessing gyroscope realizes parallel transport. 
How parallel straight lines approach or move away from each other also tells us about 
curvature. 

Linearizing Einstein’s field equation, we find that ripples in spacetime propagate as 
gravitational waves. Since a gravitational wave carries energy momentum, it inevitably acts 
upon itself. Starting with the linear theory, we soon discover that this inherent nonlinearity 
of gravity leads us to Einstein’s theory. 

The language for describing the symmetries of spacetime, often obscured by the
coordinate choice, is developed through the notions of Killing vectors and isometry.

Just as the Lorentz algebra can be extended to the Poincaré algebra, the Poincaré algebra 
can be extended to the conformal algebra. 

Differential forms provide us with a powerful method of calculating curvature. 

For the same reasons that theoretical physicists like the circle and the sphere, we like
de Sitter and anti de Sitter spacetimes.

Kałuza, Klein, and the Flowering of Higher Dimensions


More than a new continent 


Yet I exist in the hope that these memoirs, in some manner, I
know not how, may find their way to the minds of humanity in
Some Dimension, and may stir up a race of rebels who shall
refuse to be confined to limited Dimensionality.


—narrator in E. A. Abbott’s Flatland 


Minkowski’s innovative geometric view of special relativity as a 4-dimensional spacetime
bringing together space and time was so obviously true that it was quickly accepted. In
1919, the German-Polish physicist Theodor Kaluza wrote to Einstein to say that he had
added another dimension and moved up into a more spacious spacetime. Einstein was
quite taken by this brilliant idea, but it did not prevent him from sitting on Kaluza’s paper
for a year before sending it to the Prussian Academy for publication in 1921. Einstein
confessed, perhaps somewhat ruefully, that dimensions higher than 4 = 3 + 1 had never
occurred to him.

Kaluza certainly recognized the far-reaching implication of his suggestion: the paper
was titled “On the Problem of Unity in Physics.” He managed to unify the two established
interactions known at the time: gravity and electromagnetism. The Swedish physicist
Oskar Klein, known for his many contributions to physics, rediscovered the Kaluza
theory in 1926 and later developed the theory further. Since then, the subject has been
vastly extended and generalized. My strategy is to give, for the sake of pedagogical clarity,
an overview of the essential concepts involved. To keep things as simple as possible, I focus
mostly on the 5-dimensional case and relegate assorted technical details to appendices.
The profound idea of transformation


Of the several ways to motivate Kaluza’s idea, I choose to start with the profound idea of
transformation (instead of a brute force approach I will describe later). Theoretical
physicists stand in awe of the fact that, insisting on invariance under gauge transformation, we
are led to electromagnetism and, insisting on invariance under general coordinate
transformation, we are led to Einstein gravity. Furthermore, generalizing* gauge transformation,
particle physicists were led to the strong interaction and unification of the electromagnetic
and weak interactions into a single electroweak interaction, which in turn opens the door
to unifying all three nongravitational interactions (strong, weak, and electromagnetic)
into a grand unified theory, as already mentioned in chapter VIII.3. Amazingly, insisting
on invariance under these transformations has almost magically led to an understanding
of the physical world.

For now, we go back to Kaluza’s attempt to unify gravity and electromagnetism. Under
an electromagnetic gauge transformation, the electromagnetic potential transforms by

$$A_\mu \to A'_\mu = A_\mu - \partial_\mu \Lambda \tag{1}$$

In contrast, under a coordinate transformation $x^\mu \to x'^\mu$, the metric transforms by

$$g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\rho}\,\frac{\partial x^\nu}{\partial x'^\sigma}$$

At first sight, these two transformations look totally different.

But, as we saw in chapter IX.4, in the weak field limit, under the infinitesimal coordinate
transformation $x^\mu \to x'^\mu = x^\mu + \varepsilon^\mu(x)$, the field $h_{\mu\nu} \equiv g_{\mu\nu} - \eta_{\mu\nu}$ transforms by

$$h_{\mu\nu} \to h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu \varepsilon_\nu - \partial_\nu \varepsilon_\mu \tag{2}$$

The intriguing resemblance between (1) and (2) is striking to say the least, and almost
begs for some kind of unification. But how to do it if $h_{\mu\nu}$ and $A_\mu$ don’t even carry the same
number of indices?


An invisible dimension 


You rack your brain for a while, and if you’re as smart as Kaluza, you might suddenly
realize how to do it. Make one of the indices invisible! The desired relation (1) could have
originated from an equation like (2) involving objects carrying two indices, if somehow
one index is inert or invisible.

Kaluza’s idea is to add to the existing coordinates $x^0$, $x^1$, $x^2$, and $x^3$ an extra
coordinate† $x^5$. Denote the coordinates by $X^M = (x^\mu, x^5)$, with the index $M$ running over 0, 1,
2, 3, and 5.


* Maxwell theory is thus generalized into Yang-Mills theory. While this fascinating development is largely 
beyond the scope of this text (see, for example, QFT Nut, part VII), I touch upon some aspects of this important 
subject, particularly in the appendices. 

† You can see that the peculiar notation makes historical sense, since the time coordinate was once known as
$x^4 = ict$ before getting renamed as $x^0 = ct$.
In 5-dimensional spacetime, Einstein gravity is invariant under $X^M \to X'^M = X^M + \varepsilon^M(X)$.
Denote the metric by $G_{MN} = \eta_{MN} + h_{MN}$, where $\eta_{MN}$ denotes the extension of
the Minkowski metric to 5-dimensional spacetime, with $\eta_{55} = +1$ and $\eta_{\mu 5} = \eta_{5\mu} = 0$. (Note
that the sign of $\eta_{55}$ indicates that the extra dimension added is spatial, not temporal.) For
now, we consider the weak field limit, in which (with $\varepsilon$ the same order as $h$)

$$h_{MN} \to h'_{MN} = h_{MN} - \partial_M \varepsilon_N - \partial_N \varepsilon_M \tag{3}$$

under a coordinate transformation, where $\partial_M \equiv \partial/\partial X^M$.
With $M$, $N$ restricted to $\mu$, $\nu$, (3) reduces to (2). But with $M$ restricted to $\mu$ and $N$ set to
5, (3) becomes

$$h_{\mu 5} \to h'_{\mu 5} = h_{\mu 5} - \partial_\mu \varepsilon_5 - \partial_5 \varepsilon_\mu \tag{4}$$

Compare this with (1).

First, under the usual Lorentz transformation in 4-dimensional spacetime, $h_{\mu 5}$
transforms as a vector, just like $A_\mu$. So, let us identify $h_{\mu 5}$ as $lA_\mu$, with $l$ some length. Since
$G_{MN}$ is dimensionless and $A_\mu$ has dimensions of an inverse length, dimensional analysis
mandates the introduction of $l$, which sets the normalization of $A_\mu$.

Now look at (4). Do you see what Kaluza saw?

Identify $\varepsilon_5$ as $l\Lambda$. Then, up to an overall factor of $l$, (4) would become (1) were it not for
the last term $\partial_5 \varepsilon_\mu$ in (4). But we can get rid of this unwanted guy by simply supposing that
$\varepsilon_\mu$ does not depend on $x^5$. In that case, we recover (1)!

In particular, we can take $\varepsilon_\mu = 0$. In other words, the familiar electromagnetic gauge
transformation we know and love is just the 5-dimensional coordinate transformation

$$x^\mu \to x'^\mu = x^\mu, \qquad x^5 \to x'^5 = x^5 + l\Lambda(x) \tag{5}$$

Kaluza has managed to subsume electromagnetic gauge transformation into a general
coordinate transformation! This elegant idea constitutes the essence of Kaluza-Klein
theory.


The visibility problem and the escape problem 


We are immediately confronted by two closely related problems, the visibility problem 
and the escape problem. How come we don’t see the extra dimensions, in contrast to the 
glaringly obvious three that we deal with all the time? How come we can’t escape, and we 
don’t see any particle escaping, into the extra dimensions? 

A crucial feature of Einstein’s theory, that space can be curled up, almost makes it natural
to contemplate higher dimensions. Let the fifth coordinate be curled up as a circle of radius
$a$, so that $x^5$ ranges between 0 and $2\pi a$. Each spacetime point around us is actually* a tiny
circle!


* At one time, the idea that solid rock may consist of largely empty space with tiny point particles whizzing 
around would have seemed equally fantastic. 
Figure 1 Ever since our ancestors first thought of space, we have
been mistaking tiny circles for points. An electromagnetic gauge 
transformation amounts to twisting the circle at each spacetime point 
through a different angle. 


If $a$ is much smaller than any length scales our experimental friends have explored, then
we could all have been fooled ever since our ancestors first thought of space. We have been 
mistaking tiny circles for points! 

This reveals what an electromagnetic gauge transformation actually amounts to: we go 
around twisting the circle at each spacetime point through a different angle (see (5) and 
figure 1). 

This solves the visibility problem, and now Heisenberg with his uncertainty principle 
solves the escape problem for us. For a photon to escape into the fifth dimension and be 
confined there, its momentum would have to be of order p ~ 1/a, which would be huge 
for a small enough. Classically, the frequency of the corresponding electromagnetic wave 
would be enormous. To squeeze* into such a tight space, the photon (or any of our favorite 
particles) would need to be terribly energetic. 

The visibility problem and the escape problem are thus solved in one fell swoop. Indeed, 
if we assume that matter fields, such as the electromagnetic field, do not depend on the 
extra coordinates of the higher dimensions, then in some sense these fields do not “see” 
these hidden dimensions, and physics as we know it could go on happily as before. 


Unifying gravity and electromagnetism 


You might be wondering about $h_{55}$, which we have yet to talk about. In the weak field limit
in which we have been working thus far, we obtain from (3) that $h_{55} \to h'_{55} = h_{55} - 2\partial_5 \varepsilon_5$.
But to get the electromagnetic gauge transformation to come out, we had taken $\varepsilon_5$ in (5) to
be independent of $x^5$, and so $h_{55} = 0$ implies $h'_{55} = 0$. Thus, it is consistent to set $h_{55} = 0$.


* Note that in 1921, Heisenberg’s uncertainty principle and Schrödinger’s equation were both in the future,
but this argument can rest only on de Broglie waves.
Kaluza-Klein theory can be implemented in any spacetime dimension $d > 4$, but for
pedagogical clarity, it is best to stick with Kaluza’s original $d = 5$. It is convenient to give
the internal coordinate $x^5$ another name, say $y \equiv x^5$.

One remarkable feature of Einstein gravity is that the Einstein-Hilbert action
$S_{EH} = \int d^4x \sqrt{-g}\, M_P^2 R(g)$ can be written, without further ado, in any spacetime dimension.
So, go ahead, write it for $d = 5$:

$$S_{KK} = \int d^4x\, dy\, \sqrt{-G}\, M_5^3 R(G) \tag{6}$$

We denote the scalar curvature constructed out of the 5-dimensional metric $G_{MN}$
by $R(G)$,* to distinguish it from the scalar curvature $R(g)$ constructed out of the
4-dimensional metric $g_{\mu\nu}$. Here the mass $M_5$ is the analog of the Planck mass $M_P$. To
fix its power in (6), we invoke dimensional analysis as follows.

The scalar curvature $R$ (in any spacetime dimension) contains two derivatives acting on
the metric and hence has dimensions of an inverse length squared. Thus, as explained at
length in chapter VI.1, we have to multiply $\int d^4x \sqrt{-g}\, R(g)$ by something with dimension
of inverse length squared to make the Einstein-Hilbert action dimensionless, and we define
that something as the square of the Planck mass $M_P$. (In natural units with $\hbar = 1$ and $c = 1$,
mass has dimensions of an inverse length, as was explained way back in the introduction.)
In 5-dimensional spacetime, we have to multiply $\int d^4x\, dy \sqrt{-G}\, R(G)$ by something with
dimensions of inverse length cubed, that is, by some mass cubed. Hence $M_5^3$ in (6). To
summarize, the mass $M_5$ sets the scale of 5-dimensional gravity, just as $M_P$ sets the scale
of 4-dimensional gravity.

Let us see how we recover the familiar 4-dimensional gravity and electromagnetism by
evaluating the action $S_{KK}$ for various choices of $G_{MN}$.

First, for $G_{MN}(x, y)$ of the form $G_{\mu\nu}(x, y) = g_{\mu\nu}(x)$, $G_{\mu 5}(x, y) = 0$, and $G_{55}(x, y) = 1$,
we have $R(G) = R(g)$ and so the action $S_{KK}$ reduces to $2\pi a M_5^3 \int d^4x \sqrt{-g}\, R(g)$, with the
factor of $2\pi a$ coming from the integration over $y$. Identifying

$$M_P^2 = 2\pi a M_5^3 \tag{7}$$

we obtain the good old Einstein-Hilbert action. That $S_{KK}$ must reduce correctly follows
from general invariance considerations: the form of $G_{MN}$ just given is maintained under
the 4-dimensional general coordinate transformation $x^\mu = x^\mu(x'^\nu)$, $x^5 = x'^5$.

Next, consider the weak field configuration $G_{\mu\nu}(x, y) = \eta_{\mu\nu}$, $G_{\mu 5}(x, y) = lA_\mu(x)$ (with
$l$ the length introduced earlier), and $G_{55}(x, y) = 1$. The 4-dimensional action that results
from plugging this into $S_{KK}$ contains two derivatives acting on $A_\mu$ and does not change
under the gauge transformation (1). This can only be Maxwell’s action (recall (V.6.18)).
Without doing any computation, we are guaranteed by invariance considerations that
Maxwell’s action must pop out!


* If you confused this G with Newton’s constant, go back to square one. 
In other words, with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ as always, $S_{KK}$ must reduce to

$$2\pi a M_5^3 l^2 \zeta \int d^4x\, F_{\mu\nu}F^{\mu\nu} = (M_P l)^2 \zeta \int d^4x\, F_{\mu\nu}F^{\mu\nu}$$

with some unknown numerical coefficient $\zeta$. A detailed calculation (we will do it in
appendix 1) is required only if we want to determine $\zeta$, which turns out to be $-\frac{1}{4}$. Thus, to
obtain Maxwell’s action $S = \int d^4x\,(-\frac{1}{4})F_{\mu\nu}F^{\mu\nu}$ as commonly normalized, we set $M_P l = 1$,
that is, $l = l_P$. (In other words, from the way $A_\mu$ was introduced, we are free to let $A \to \lambda A$
and $l \to l/\lambda$. We have now picked a particular normalization for $A$, thus fixing $l$.)


The Kaluza-Klein metric


Thus far, our discussion has been in the weak field expansion discussed in chapter IX.4,
with $G_{MN}$ given in two special cases. To construct the Kaluza-Klein metric $G_{MN}$ in its
full glory, we go back to (1) and note that under a gauge transformation,
$A_\mu dx^\mu \to (A_\mu - \partial_\mu \Lambda)dx^\mu = A_\mu dx^\mu - d\Lambda$. (In a sense, we are going back to chapter IV.1, where
you allegedly discovered electromagnetism.) Thus, the combination $(dy + lA_\mu dx^\mu)$ is
gauge invariant if we also transform $y \to y + l\Lambda$ (as in (5)). You of course realize that we
are just rewriting the $M = 5$ component of the coordinate transformation
$X^M \to X'^M = X^M + \varepsilon^M(x)$ and rediscovering (4).

Having made the acquaintance of this invariant combination $(dy + lA_\mu dx^\mu)$, we are
now ready to write down the 5-dimensional line element instantly. By invariance, it can
only be

$$ds^2 = g_{\mu\nu}dx^\mu dx^\nu + (dy + lA_\mu dx^\mu)^2 \tag{8}$$

Comparing this with $ds^2 = G_{MN}dX^M dX^N = G_{\mu\nu}dx^\mu dx^\nu + 2G_{\mu 5}dx^\mu dy + G_{55}dy^2$, we
can read off

$$G_{MN} = \begin{pmatrix} g_{\mu\nu} + l^2 A_\mu A_\nu & lA_\mu \\ lA_\nu & 1 \end{pmatrix} \tag{9}$$

In other words, we have constructed the Kaluza-Klein metric. Notice that $G_{\mu\nu}$ is equal
to $g_{\mu\nu}$ only in the weak field expansion implicit in (3). In appendix 7, we give a more
geometrical derivation of (8) for arbitrary dimensions. See (66).
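
For readers who want the intermediate step behind (9), the square in (8) can be expanded explicitly. This is only a worked sketch; it assumes nothing beyond (8) itself and the symmetry of $G_{MN}$.

```latex
\begin{align*}
ds^2 &= g_{\mu\nu}\,dx^\mu dx^\nu + (dy + l A_\mu dx^\mu)^2 \\
     &= \left(g_{\mu\nu} + l^2 A_\mu A_\nu\right) dx^\mu dx^\nu
        + 2\,l A_\mu\, dx^\mu dy + dy^2 .
\end{align*}
% Matching term by term against
% G_{\mu\nu} dx^\mu dx^\nu + 2 G_{\mu 5} dx^\mu dy + G_{55} dy^2 :
\[
G_{\mu\nu} = g_{\mu\nu} + l^2 A_\mu A_\nu, \qquad
G_{\mu 5} = G_{5\mu} = l A_\mu, \qquad
G_{55} = 1 .
\]
```

The cross term carries a factor of 2 on both sides, which is why the off-diagonal entry is $lA_\mu$ and not $2lA_\mu$.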


Motion in the fifth dimension 


We have yet to fix the radius $a$ of the tiny circles all around us. Here is a clue: our fixing
the normalization of $A_\mu$ by setting $l = l_P$ is an empty gesture unless $A_\mu$ actually couples
to some charged particle or field. Thus far, this is absent from Kaluza-Klein theory. It
behooves us now to work out the motion of a point particle in this theory. Intuitively, since
$G_{\mu 5} = lA_\mu$, we expect that a particle moving along the fifth coordinate $y$ would sense the
electromagnetic potential. This turns out to be the case.

To keep life simple, let the (3 + 1)-dimensional spacetime be flat; after all, we now
know how point particles move under gravity. The focus here is on the coupling to the
electromagnetic field. So, set $g_{\mu\nu}$ in (8) to $\eta_{\mu\nu}$ and write down the action

$$S_{\rm particle} = -m \int \left[-\eta_{\mu\nu}dx^\mu dx^\nu - (dy + lA_\mu dx^\mu)^2\right]^{\frac{1}{2}} \tag{10}$$

Apply what you learned in part II of this book: vary $S_{\rm particle}$ with respect to $y$ and $x^\mu$ to
obtain the equations of motion for the particle. First, since the metric does not depend on
$y$ explicitly, we have the conservation law $\frac{d}{d\tau}\left(\frac{dy}{d\tau} + lA_\mu \frac{dx^\mu}{d\tau}\right) = 0$, so that

$$p \equiv m\left(\frac{dy}{d\tau} + lA_\mu \frac{dx^\mu}{d\tau}\right) \tag{11}$$

is a constant. Indeed, you recognize $p$ as the conserved momentum in the $y$ direction.
Next, varying with respect to $x^\mu$, we obtain, after using (11),

$$\frac{d}{d\tau}\left(m\frac{dx_\mu}{d\tau} + (pl)A_\mu\right) = (pl)\,\partial_\mu A_\nu \frac{dx^\nu}{d\tau} \tag{12}$$

As in chapter IV.1, this equation of motion reduces to $m\frac{d^2x^\mu}{d\tau^2} = (pl)F^\mu{}_\nu \frac{dx^\nu}{d\tau}$. Comparing
with (IV.1.23), we see that $pl = q$ is the charge of the particle. Our intuition is vindicated:
the momentum $p$ along the $y$ direction determines how strongly the particle senses $A_\mu$.
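
The step from (12) to the Lorentz force law can be spelled out as follows. This is a sketch under two assumptions: $p$ is constant along the trajectory, as stated in (11), and $A_\mu$ depends on $\tau$ only through the position $x(\tau)$. The placement of the free index (lowered $\mu$) is a choice made here for illustration.

```latex
\begin{align*}
% chain rule for the potential evaluated along the trajectory
\frac{d}{d\tau}A_\mu\big(x(\tau)\big)
  &= \partial_\nu A_\mu\,\frac{dx^\nu}{d\tau}, \\
% move this term to the right-hand side of (12)
m\,\frac{d^2 x_\mu}{d\tau^2}
  &= (pl)\left(\partial_\mu A_\nu - \partial_\nu A_\mu\right)\frac{dx^\nu}{d\tau}
   = (pl)\,F_{\mu\nu}\,\frac{dx^\nu}{d\tau}.
\end{align*}
```

The antisymmetric combination $F_{\mu\nu}$ appears automatically: one term comes from the explicit $x$ dependence of the action, the other from differentiating $A_\mu$ along the path.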

But what is $p$? Classically, the momentum $p$ can take on any value, and hence the charge
$q$ also. In quantum mechanics, however, the momentum of a particle moving around a
circle is quantized. Earlier, we already noted that the uncertainty principle implies that
$p$ is of order $1/a$. But we can be more precise. Here I have to assume that the reader is
sufficiently familiar with quantum mechanics to know that the wave function of a particle
around a circle of radius $a$ is given by $e^{ipy/\hbar}$. The reader who does not know this can
simply skip this and the next section. (Note that we have momentarily restored $\hbar$.) Since $y$
and $y + 2\pi a$ represent the same point, $e^{i2\pi ap/\hbar} = 1$, and so $p$ must take on the quantized
value $p = n\hbar/a$, with $n$ any integer between $-\infty$ and $+\infty$. Since $q = pl$, we find that the
charge of the particle is also quantized to have the allowed values $q_n = n\hbar l/a = ne$, with
the fundamental unit of charge given by

$$e = \frac{\hbar l}{a} \tag{13}$$

(we set $\hbar = 1$ once again). In natural units, $e \sim \sqrt{4\pi/137} \sim 0.3$. Since $l = l_P$, we find
that $a = l_P/e \sim 3l_P$ is also of the order of the Planck length. Nicely, this accords with the
undeniable fact that experimentalists have not seen the Kaluza-Klein circles to this very day.
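
The numbers quoted above are easy to reproduce. The following is a minimal sketch, assuming natural units ($\hbar = c = 1$), the convention $\alpha = e^2/4\pi$ with $\alpha \approx 1/137$, and lengths measured in units of $l_P$; the function names are hypothetical and chosen only for this illustration.

```python
import math

ALPHA = 1 / 137.0  # approximate fine structure constant, as used in the text


def unit_charge(alpha=ALPHA):
    """Fundamental charge e in natural units, assuming alpha = e^2 / (4 pi)."""
    return math.sqrt(4 * math.pi * alpha)


def circle_radius(l=1.0, alpha=ALPHA):
    """Radius a = l / e from (13) with hbar = 1; l is given in Planck lengths."""
    return l / unit_charge(alpha)


def allowed_charges(n_max, alpha=ALPHA):
    """Quantized charges q_n = n e for n = -n_max, ..., n_max."""
    e = unit_charge(alpha)
    return [n * e for n in range(-n_max, n_max + 1)]


e = unit_charge()
a = circle_radius()
print(f"e is about {e:.3f}, a is about {a:.2f} Planck lengths")
```

With $l = l_P$ the radius comes out at a few Planck lengths, and the list returned by `allowed_charges` makes the equal spacing of the charge spectrum explicit.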
With the internal space a circle, we are invited to introduce an angular variable $\theta$ defined
by $y = a\theta$. You are of course free to think of $\theta$, rather than $y$, as the internal coordinate
and rewrite (8) as* $ds^2 = g_{\mu\nu}dx^\mu dx^\nu + a^2(d\theta + eA_\mu dx^\mu)^2$.


Charge conjugation and direction of motion 


That charge quantization pops out of the Kaluza-Klein framework is quite striking. Another
nice feature of the theory is that the concept of charge conjugation, and hence of antimatter, 
also emerges naturally. This remark is directed to those readers who know about the three 
fundamental discrete symmetries of physics, namely parity (that is, reflection in space), 
time reversal (that is, reflection in time), and charge conjugation (that is, turning particles 
into antiparticles, thus flipping the sign of various conserved charges). Of these three 
discrete symmetries, charge conjugation stands apart from parity and time reversal in 
that it does not appear to have anything to do with spacetime. But in the Kaluza-Klein
framework, it does. Charge conjugation just corresponds to reversing the motion of the 
particle along the y direction. The existence of antimatter follows from the possibility of 
going the other way around the circle! 

Unfortunately, there is also a serious difficulty with this basic version of Kaluza-Klein
theory: the charged particles found here all have y momentum of the order of the Planck 
mass Mp. This implies that these particles are all very massive (as we will make precise 
in the next section) and do not correspond to the observed charged particles. Incidentally, 
Kaluza already noted this difficulty in his paper, with a note thanking Einstein for pointing
this out to him. 


Extragalactic fable revisited 


Time to revisit the extragalactic fable in chapter IV.1! Recall that the extragalactic version
of you tried to include a potential in the Lorentz action for a free particle. You managed
to come up with two, and only two, options: $S_A = -\int\{m\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu} + V(x)dt\}$ and
$S_B = -m\int\sqrt{\left(1 + \frac{2V}{m}\right)dt^2 - d\vec{x}^2}$. You can put the potential $V$ either outside or inside
the square root. How could there possibly be another option?

After staring at this for years, you might, if you are as clever as Kaluza, realize one day that
there is a third option unifying these two options. You extend the indices $\mu, \nu = 0, 1, 2, 3$
to $M, N = 0, 1, 2, 3, 5$; write $S = -m\int\sqrt{-G_{MN}dX^M dX^N}$; and set $G_{\mu\nu} = \eta_{\mu\nu}$ and $G_{55} = 1$.
Then

$$S = -m\int\sqrt{-\eta_{\mu\nu}dx^\mu dx^\nu - 2G_{5\mu}dx^5 dx^\mu - (dx^5)^2} = -m\int\sqrt{d\tau^2 - 2G_{5\mu}dx^5 dx^\mu - (dx^5)^2}$$

$$\simeq -m\int d\tau\left(1 - G_{5\mu}\frac{dx^5}{d\tau}\frac{dx^\mu}{d\tau}\right) = \int\left(-m\,d\tau + eA_\mu dx^\mu\right)$$

where in the last step you set $\frac{dx^5}{d\tau} = e/m$ and $G_{5\mu} = A_\mu$. This is the Kaluza idea in
essence.


* Readers familiar with quantum mechanics will be pleased to see the phase angle of the wave function for a
charged particle emerging. We have $e^{ipy/\hbar} = e^{ipa\theta/\hbar} = e^{in\theta}$ (see, for example, QFT Nut, p. 79).


Kaluza-Klein towers


A striking prediction of Kaluza-Klein theory is the existence of entire towers of particles.
In this section, we set not only $g_{\mu\nu} = \eta_{\mu\nu}$ but also $A_\mu = 0$. Consider the wave equation
$\partial^2\Phi = \left(-\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial\vec{x}^2} + \frac{\partial^2}{\partial y^2}\right)\Phi = 0$ satisfied by some field $\Phi$ whose identity we need not
specify for this schematic discussion. The solution is $\Phi(t, \vec{x}, y) = e^{-i\omega t + i\vec{k}\cdot\vec{x} + i\kappa y}$ with
$\omega^2 = \vec{k}^2 + \kappa^2$.

Here we essentially repeat the argument given in the previous section. Since $y$ is curled
up in a circle with radius $a$, the periodic boundary condition specifies that $\kappa = n/a$,
with $n$ an integer. In quantum mechanics, de Broglie tells us that frequency and wave
number become energy $E = \hbar\omega$ and momentum $\vec{p} = \hbar\vec{k}$, respectively. (Here $\vec{p}$ is the
3-momentum observed in the (3 + 1)-dimensional Minkowskian spacetime, while the
momentum along the $y$ direction that appeared in the preceding section is $p = \hbar\kappa$.) Thus,
in Kaluza-Klein theory, each one of the particles we know (the electron, the photon, and so
on) is associated with a tower of particles with masses given by $m_n^2 = E^2 - \vec{p}^2 = (n\hbar/a)^2$.
(In appendix 4, we repeat the calculations in these two sections more carefully.)

For $n \neq 0$, these masses are enormous, of order $M_P$, since as we saw in the preceding
section, $a$ is of order $l_P$. The Kaluza-Klein towers are conveniently hidden away from the
prying eyes of our experimental friends.

On the Planck scale, the known fundamental particles, quarks, leptons, and such, are
essentially massless; in particle theory, their observed masses* are accounted for by the
so-called Higgs mechanism. In this interpretation of Kaluza-Klein theory, the known particles
correspond to $n = 0$, in which case the corresponding field $\Phi(x, y)$ does not depend on $y$
and thus in some sense does not know about the extra dimension.

We will not address further this difficulty of the theory not containing charged particles 
that are also massless (that is, with masses much less than Mp). 

Note also that we have not inquired about the “mechanism” that breaks the 5-dimensional
spacetime into 4 cosmologically large coordinates and one small coordinate.


Breathing circles 


We have shown that it is consistent to set $h_{55} = 0$ and hence $G_{55} = 1$, but actually, we have
the option of giving life to $G_{55}$ and turning it into a field. Promote the fixed radius $a$ to a
field $\phi(x)$ (which evidently has dimension of a length) so that

$$ds^2 = g_{\mu\nu}dx^\mu dx^\nu + \phi(x)^2(d\theta + eA_\mu dx^\mu)^2 \tag{14}$$


* As was emphasized in chapter III.5, one triumph of special relativity is that it allows us to talk about massless 
particles. 
(Henceforth, we will absorb $e$ into $A_\mu$.) The Kaluza-Klein metric is correspondingly
modified to

$$G_{MN} = \begin{pmatrix} g_{\mu\nu} + \phi^2 A_\mu A_\nu & \phi^2 A_\mu \\ \phi^2 A_\nu & \phi^2 \end{pmatrix} \tag{15}$$

(In comparing (15) with (9), note that one is written in the basis $(dx, d\theta)$, the other
in $(dx, dy)$.) Throughout, $g_{\mu\nu}$, $A_\mu$, and $\phi$ are all functions of $x$; we often suppress the
$x$ dependence to avoid clutter. For the record, the inverse metric is given by

$$G^{NP} = \begin{pmatrix} g^{\nu\rho} & -A^\nu \\ -A^\rho & \phi^{-2} + A^2 \end{pmatrix} \tag{16}$$

where $A^2 = A_\mu A^\mu$.
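
That (16) really is the inverse of (15) can be checked numerically at a single spacetime point. The sketch below is an illustration only: it takes $g_{\mu\nu}$ to be the flat metric diag(-1, 1, 1, 1) for simplicity, picks arbitrary sample values for $A_\mu$ and $\phi$, and uses hypothetical helper names. Indices are raised with $g^{\mu\nu}$, so that $A^\mu = g^{\mu\nu}A_\nu$.

```python
import random


def kk_metric(g, A, phi):
    """5x5 metric of (15) in the basis (dx^mu, dtheta), as nested lists."""
    G = [[0.0] * 5 for _ in range(5)]
    for m in range(4):
        for n in range(4):
            G[m][n] = g[m][n] + phi**2 * A[m] * A[n]
        G[m][4] = G[4][m] = phi**2 * A[m]
    G[4][4] = phi**2
    return G


def kk_inverse(g_inv, A, phi):
    """Candidate inverse of (16), built from g^{mu nu} and A^mu = g^{mu nu} A_nu."""
    A_up = [sum(g_inv[m][n] * A[n] for n in range(4)) for m in range(4)]
    A_sq = sum(A[m] * A_up[m] for m in range(4))
    G_inv = [[0.0] * 5 for _ in range(5)]
    for m in range(4):
        for n in range(4):
            G_inv[m][n] = g_inv[m][n]
        G_inv[m][4] = G_inv[4][m] = -A_up[m]
    G_inv[4][4] = phi**-2 + A_sq
    return G_inv


def matmul(X, Y):
    """Plain matrix product of two square nested lists."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def max_deviation_from_identity(M):
    """Largest absolute entry of M - 1."""
    size = len(M)
    return max(abs(M[i][j] - (1.0 if i == j else 0.0))
               for i in range(size) for j in range(size))


# Flat 4-dimensional metric (its own inverse); sample values are arbitrary.
eta = [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(4)]
       for i in range(4)]
random.seed(0)
A = [random.uniform(-1.0, 1.0) for _ in range(4)]
phi = 0.7

product = matmul(kk_metric(eta, A, phi), kk_inverse(eta, A, phi))
deviation = max_deviation_from_identity(product)
print(deviation)  # should be at the level of floating-point rounding
```

A check at one point with one sample potential is of course not a proof, but it catches sign and index slips quickly; the general statement follows from multiplying the two block matrices by hand.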

The scalar field $\phi(x)$ is sometimes called the dilaton or radion. We suppose that in the
ground state, $\phi(x) = a$.

Note that $G$, the determinant of $G_{MN}$, is given by $G = \phi^2 g$ (with $g$ the determinant of
$g_{\mu\nu}$ as usual), so that $\phi$ controls the volume of 5-dimensional spacetime. We could perhaps
visualize this collection of Kaluza-Klein circles as an immense colony of minute marine
organisms breathing, pulsating, and changing in size.

Note that we have now accounted for all $(5 \cdot 6)/2 = 15 = 10 + 4 + 1$ components of $G_{MN}$:
10 in $g_{\mu\nu}$, 4 in $A_\mu$, and 1 in $\phi$.
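
The same bookkeeping works for any number of extra dimensions. As a small sketch (the function name and the dictionary keys are hypothetical), the independent components of a symmetric $d$-dimensional metric split into a 4-dimensional metric, $4(d - 4)$ mixed components, and the components of the internal metric:

```python
def metric_components(d):
    """Split the d(d+1)/2 independent components of a symmetric metric G_MN
    into 4-dimensional pieces, with n = d - 4 internal dimensions."""
    if d < 4:
        raise ValueError("need at least the 4 familiar spacetime dimensions")
    n = d - 4
    graviton = 4 * 5 // 2        # g_{mu nu}
    vectors = 4 * n              # G_{mu j}, one 4-vector per internal direction
    scalars = n * (n + 1) // 2   # G_{ij}, the internal metric
    total = d * (d + 1) // 2
    # The three pieces must exhaust the full symmetric matrix.
    assert graviton + vectors + scalars == total
    return {"g_mu_nu": graviton, "G_mu_j": vectors, "G_ij": scalars, "total": total}


print(metric_components(5))
```

For $d = 5$ this reproduces the split 10 + 4 + 1 quoted above.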

At the time of Kaluza and Klein, experimentalists had never heard of a scalar field, and
$\phi(x)$ was seen as a fly in the ointment gluing gravity and electromagnetism together. In
modern times, however, string theory contains a multitude of scalar fields, in particular
the dilaton field, and the natural appearance of scalar fields was celebrated with much
exuberance and joy. Nevertheless, experimentalists have not yet seen the Kaluza-Klein
scalar field.

For some reason, while it is easy to excite $g_{\mu\nu}$ and $A_\mu$, which we do endlessly each day,
$\phi$ is very hard to excite. The Kaluza-Klein radius is extraordinarily rigid! Why? Nobody knows
for sure.

I alluded earlier to a brute force plug-in approach. You can now see how this alternative
presentation starts by plugging the $G_{MN}$ displayed in (15), given without any motivation,
into the action (6). After some tedious calculations, the action would be found to reduce to
Einstein-Hilbert plus Maxwell plus an action for $\phi$. I personally find this sort of approach
not particularly illuminating. As to the inevitable question of why we should contemplate
the form given in (15), the answer could be that we can always start with a general $G_{MN}$,
demand that the action reduce to a nice form, and by trial and error arrive at (15). In
appendix 7, I give a geometrical picture that leads to (15) and its higher dimensional
generalization, to which we now turn.


Higher dimensional Kaluza-Klein and Yang-Mills theories


Back in 1921, only the gravitational and the electromagnetic interactions were known. It 
took decades before the strong and weak interactions were clearly recognized as such. Still 
later, in the 1970s, the strong, electromagnetic, and weak interactions were all discovered to
be described by the nonabelian gauge theory* written down by Yang and Mills in 1954, as I
have already mentioned in chapter VIII.3 and at the start of this chapter. I also mentioned in
chapter VIII.3 that, furthermore, these three nongravitational interactions can be unified
into a single gauge theory. Although experimentalists have yet to verify this grand unified
theory, many theorists have professed faith in its general structure.

Some readers may not be familiar with Yang-Mills theory. For you to follow the rest of
this chapter, all I ask of you is to know that the familiar electromagnetic gauge potential $A_\mu$
is generalized to a bunch of potentials $A^a_\mu$, where the index† $a$ is a group index associated
with the nonabelian gauge group and labels the generators of the group. For example, for
the group $SO(3)$, the index $a$ ranges over 1, 2, and 3. Maxwell’s theory corresponds to the
simplest possible case, in which the gauge group is $U(1)$ and the index $a$ can take on only
one possible value (namely 1) and hence can be suppressed.

As already remarked, Kaluza’s idea can be extended to any dimension. We simply start
with a higher dimensional Einstein-Hilbert action

$$S_{KK} = \int d^dx\, \sqrt{-G}\, M_{TG}^{d-2} R(G) = \int d^4x\, d^{d-4}y\, \sqrt{-G}\, M_{TG}^{d-2} R(G) \tag{17}$$

with $M_{TG}$ the “true” mass scale of gravity. As explained earlier, the power of $M_{TG}$ in
(17) is fixed by dimensional analysis. The internal space (with coordinates $x^5, x^6, \cdots, x^d$,
which we also refer to as $y$s) is compactified into a sphere rather than a circle, or more
generally, a closed curved space characterized by some length $a$. As before, evaluating $S_{KK}$
with a $G_{MN}(x, y)$ of the form $G_{\mu\nu}(x, y) = g_{\mu\nu}(x)$, $G_{\mu j}(x, y) = 0$, and $G_{ij}(x, y) = 1$ (with
$i, j = 5, 6, \cdots, d$), we recover the familiar Planck mass

$$M_P^2 = M_{TG}^{d-2} V_{d-4} = M_{TG}^{n+2} V_n \tag{18}$$

with $V_n$ the volume of the $n$-dimensional internal space, thus generalizing the earlier
relation $M_P^2 = 2\pi a M_5^3$.
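
To get a feel for (18), one can solve it for the true gravity scale given the internal volume. The sketch below is illustrative only: it works in units with $M_P = 1$ (so lengths are in units of $1/M_P$), the function name is hypothetical, and the sample radius $a = 3.3$ is simply an assumed value of the order suggested earlier in the chapter.

```python
import math


def true_gravity_scale(planck_mass, internal_volume, n):
    """Solve M_P^2 = M_TG^(n+2) * V_n for M_TG (natural units, hbar = c = 1)."""
    if n < 1 or internal_volume <= 0:
        raise ValueError("need n >= 1 internal dimensions and a positive volume")
    return (planck_mass**2 / internal_volume) ** (1.0 / (n + 2))


# One extra dimension curled into a circle of radius a, so V_1 = 2 * pi * a.
a = 3.3  # assumed sample radius, in units of 1/M_P
m5 = true_gravity_scale(1.0, 2 * math.pi * a, n=1)
print(m5)  # the 5-dimensional scale M_5 in units of M_P
```

Because the internal volume enters only through the root of order $n + 2$, a circle a few Planck lengths around leaves the true scale within an order of magnitude of $M_P$; a much larger internal volume would lower it accordingly.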

Perhaps not surprisingly, just as the Maxwell action pops out of Kaluza-Klein theory,
the Yang-Mills action also pops out. (Thus, in this context, you don’t even have to know
what the Yang-Mills action is. You could derive it from the Kaluza-Klein action.) To me,
that is the truly beautiful feature of Kaluza-Klein theory. In appendix 8, I show how the
Yang-Mills field strength emerges algebraically and geometrically.

Our experience with the 5-dimensional example, in which $G_{\mu 5}$ is identified with $A_\mu$,
suggests that $4(d - 4)$ components $G_{\mu j}(x, y)$, $j = 5, 6, \cdots, d$, must end up being born
in (3 + 1)-dimensional spacetime as the Yang-Mills gauge potential $A^a_\mu$.

The issue here is how to relate $G_{\mu j}(x, y)$ to $A^a_\mu(x)$. Both objects carry the index $\mu$. But
the indices $j$ and $a$, carried by one but not the other, are manifestly different beasts. We
have to find a way to connect them, using some mathematical entity that carries both $j$
and $a$.


* In this terminology, Maxwell’s theory is known as an abelian gauge theory. 
† Not to be confused with the radius of the Kaluza-Klein circles of course!
Now your hard work learning about isometry and Killing vectors in chapter IX.6 pays
off. Consider, for example, the sphere $S^2$. The isometry group, the group that leaves $S^2$
invariant, is $SO(3)$, the group of rotations in 3-dimensional space. On $S^2$, we have 3 Killing
vectors $\xi_a$, with $a = 1, 2, 3$, associated with the 3 generators. Indeed, they were explicitly
displayed in chapter IX.6. Write the Killing vectors out in components $\xi_{aj} = g_{jk}\xi_a^k$ (with
$g_{jk}$ the metric on the sphere). Since the vectors $\xi_a$ live on $S^2$, the index $j$ takes on two
values $j = 1, 2$, corresponding to the two coordinates on $S^2$.

We see that, in general, the Killing vectors $\xi_{aj}$ are precisely what we need: they carry
both the $j$ index and the $a$ index. Given these 3 entities, the off-diagonal components of
the higher dimensional metric $G_{\mu j}(x, y)$, the Yang-Mills gauge potential $A^a_\mu(x)$, and the
Killing vectors $\xi_{aj}(y)$, the only relation we can write down is

$$G_{\mu j}(x, y) = \xi_{aj}(y)A^a_\mu(x) = g_{jk}(y)\,\xi^k_a(y)A^a_\mu(x) \tag{19}$$

The indices have to hang together right, and this places a powerful constraint on what is
possible. Another example of symmetry considerations saving us a lot of work! We will see
in appendix 8 that this relation is indeed correct.


Road signs to higher dimensions 


Two major concepts lie at the foundation of modern physics: local coordinate invariance 
and local internal or gauge symmetry. The former leads to the theory of gravity; the latter 
leads to the gauge theory of strong, weak, and electromagnetic interactions. 

The remarkable discovery* of Kaluza is that if we suppose that spacetime is embedded in
a space with dimension higher than four, these two concepts may not be independent: the
latter may be derived from the former. The physics is so astonishing and the mathematics
so elegant that many theoretical physicists might be disappointed if Nature does not use
the Kaluza mechanism at some level. Nature, please do not disappoint us!

Furthermore, string theory, the leading candidate theory for unifying all four fundamental
interactions, can be consistently formulated only in higher dimensional spacetime and
thus requires the Kaluza mechanism. A priori, this need not be; if somebody had written
down a consistent theory of quantum gravity incorporating the known interactions in
4-dimensional spacetime, Kaluza-Klein theory would have disappeared into the dustbin
of physics history. Also, if Kaluza-Klein theory could not incorporate Yang-Mills theory
naturally, it also would have been kissed bye-bye. (See below, however.)

I mention in passing a historical curiosity. In chapter IX.5, we saw that we could follow 
either Nordström or Einstein to a relativistic theory of gravity. Interestingly, both roads
lead to higher dimensions. 


* In textbooks, theoretical physics is laid out, all polished and beautiful, as if it were almost logically inevitable. 
To counter this, I might mention that in his paper, Kaluza mentioned the possibility of his theory explaining
terrestrial magnetism. 
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Idea for cartoon: Theoretical physicist driving down a highway and seeing an exit sign 
for “Higher Dimensions.” 

Recall that Nordström described the gravitational field by a scalar field $\Phi$. In a brilliant 
insight, he noticed¹¹ in 1914 that gravity and electromagnetism can be unified in the (4 + 1)-dimensional 
Maxwell action $S = \int d^5x\,(-\frac14 F_{MN}F^{MN})$, with $M, N = 0, 1, 2, 3, 5$. Let $y = x^5$ 
describe a circle with radius $a$, and let $A_\mu(x, y) = A_\mu(x)$ and $A_5(x, y) = \Phi(x)$ depend only 
on the (3 + 1)-dimensional coordinates $x$, so that $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the usual 
electromagnetic field and $F_{\mu 5} = \partial_\mu A_5 = \partial_\mu\Phi$. The action becomes 
$S = 2\pi a\int d^4x\,(-\frac14 F_{\mu\nu}F^{\mu\nu} - \frac12\partial_\mu\Phi\,\partial^\mu\Phi)$. Gravity and electromagnetism are unified under false pretense. 
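
The reduction can be checked term by term. The following is only a sketch: it assumes the mostly-plus signature $\eta_{MN} = \mathrm{diag}(-1,+1,+1,+1,+1)$ used for the scalar actions elsewhere in this chapter, and it assumes that $y$ runs once around the circle, from $0$ to $2\pi a$.

```latex
\begin{align*}
-\tfrac14 F_{MN}F^{MN}
  &= -\tfrac14 F_{\mu\nu}F^{\mu\nu}
     - \tfrac14\left(F_{\mu 5}F^{\mu 5} + F_{5\mu}F^{5\mu}\right) \\
  &= -\tfrac14 F_{\mu\nu}F^{\mu\nu}
     - \tfrac12\,\eta^{55}\,\partial_\mu\Phi\,\partial^\mu\Phi ,
     \qquad \eta^{55} = +1, \\
\int_0^{2\pi a} dy = 2\pi a
  \quad&\Longrightarrow\quad
  S = 2\pi a\int d^4x\left(-\tfrac14 F_{\mu\nu}F^{\mu\nu}
      - \tfrac12\,\partial_\mu\Phi\,\partial^\mu\Phi\right).
\end{align*}
```

The scalar simply rides along as an ordinary massless field next to the Maxwell field; nothing couples the two, which is the sense in which the unification is under false pretense.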

The moral of the story is that in theoretical physics, simplicity may not be all that it’s 
cracked up to be. 


Some negative notes 


Considering how fundamental invariances¹² under (1) and (2) are to theoretical physics, it 
would be disappointing indeed if Nature does not make use of this beautifully simple way 
of unifying these two equations. Unhappily, higher dimensional theories currently being 
worked on typically come with their own sets of Yang-Mills gauge fields. For example, string 
theory already contains gauge fields among the vibrational modes of the string. Thus, the 
gauge fields produced by the Kaluza-Klein mechanism are not needed. If this should turn 
out to be correct, it would appear that Nature is “unreasonably” wasteful. 

On this somewhat negative note, let me mention that we have avoided talking about 
the mechanism for compactifying the extra dimensions. What would have been the simplest 
(and cleanest) possibility, namely that gravity supplemented by a cosmological constant 
could do the job, turns out not to work. Indeed, look at the field equation 
$R_{MN} = \frac12 g_{MN}(R - \Lambda)$ and demand a flat spacetime, that is, $R_{\mu\nu} = 0$. But this implies $R = \Lambda$ and 
by the field equation, $R_{ij} = 0$. The internal space is Ricci flat and cannot curl up into a 
sphere. Numerous mechanisms¹³ have been proposed, none compelling, involving the 
introduction of additional fields and using the energy momentum tensor of these fields to 
curl up the internal space. Some of these additional fields are in fact gauge fields or their 
generalizations. 
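
Spelled out as a chain of implications, the no-go argument reads as follows. This is a sketch that takes the field equation exactly as quoted above, with the index $M = (\mu, i)$ split into spacetime and internal parts.

```latex
\begin{align*}
R_{MN} &= \tfrac12\, g_{MN}\,(R - \Lambda), \\
R_{\mu\nu} = 0 \ \text{with}\ g_{\mu\nu} \neq 0
  \quad&\Longrightarrow\quad R = \Lambda, \\
R_{ij} = \tfrac12\, g_{ij}\,(R - \Lambda)
  \quad&\Longrightarrow\quad R_{ij} = 0 .
\end{align*}
```

A round sphere has $R_{ij}$ proportional to $g_{ij}$ with a nonzero coefficient, so it cannot satisfy the last line.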

I might as well mention another difficulty that diminishes interest in Kaluza-Klein 
theory, at least as traditionally formulated. (The following remarks are well beyond the 
scope of this book and are intended only for readers with at least a nodding acquaintance 
with particle physics.) In the so-called standard model of particle physics, fundamental 
fermions (namely the quarks and leptons) appear as left handed and right handed fields 
in unequal numbers. This fundamental lack of symmetry between left and right goes back 
to parity violation in the weak interaction. As should be intuitively¹⁴ clear, however, if the 
internal manifolds are simple spheres, then the resulting Kaluza-Klein theory can hardly 
distinguish between left and right. 

This chapter contains a large number of appendices, many of which can be skipped upon 
first reading. Here is a list of the topics addressed: (1) a calculation of the 5-dimensional 
action using differential forms, verifying that it contains the Einstein-Hilbert and Maxwell 
actions; (2) symmetry arguments and the role of the dilaton $\phi$ in the action; (3) the Jordan 
frame versus the Einstein frame; (4) the charged scalar field; (5) the natural emergence 
of the Yang-Mills gauge potential; (6) the Kaluza-Klein metric viewed as foliation; (7) the 
Kaluza-Klein metric in the vielbein formalism; (8) a more geometrical view of Kaluza-Klein 
theory; (9) a glimpse of the Arnowitt-Deser-Misner formalism; and (10) some historical 
tidbits. 


Appendix 1: Einstein-Hilbert contains Maxwell 


As promised, we now calculate the 5-dimensional scalar curvature $R(G)$ in (17) for the metric 

$ds^2 = g_{\mu\nu}dx^\mu dx^\nu + (dy + lA_\mu dx^\mu)^2$ (20) 


In this appendix, we absorb $l$ into $A_\mu$ for ease of writing. 

From general considerations, we have already argued in the text that $R(G) = R(g) + cF^{\mu\nu}F_{\mu\nu}$ for some 
numerical constant $c$. You are invited to proceed by brute force, evaluating the Christoffel symbols, the 
Ricci tensor, and the scalar curvature, so as to determine $c$. In fact, as already mentioned in the text, to 
determine this numerical coefficient, you can ease your labor by calculating $R(G)$ for the special case of 
$g_{\mu\nu} = \eta_{\mu\nu}$. 

Here, differential forms are presented with an opportunity to rise and shine. So instead of Christoffel symbols 
and the rest, here I will use differential forms. For the sake of pedagogy, and in a departure from my usual 
philosophy, I will actually calculate R(G) for the metric (20) in its full glory, with a general 4-dimensional metric, 
even though we could simply set it to the Minkowskian metric. As we will soon see, though, the hard part is not 
getting $R(g)$, but getting $F^{\mu\nu}F_{\mu\nu}$. 

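For convenience, write $R = R(G)$ and indicate quantities associated with the 4-dimensional metric with a tilde. 
Start with the 5-dimensional 1-forms $e^A = (e^\alpha, e^5)$ given by $e^\alpha = e^\alpha{}_\mu dx^\mu = \tilde e^\alpha$ and $e^5 = dy + A_\mu dx^\mu = dy + A_\alpha e^\alpha$, 
where the last step defines $A_\alpha$. For convenience we also define $\partial_\alpha$ by $dx^\mu\partial_\mu = e^\alpha\partial_\alpha$. The 5-dimensional metric 
is then given as usual by 

$G_{MN} = \eta_{AB}\,e^A{}_M e^B{}_N = \eta_{\alpha\beta}\,e^\alpha{}_M e^\beta{}_N + e^5{}_M e^5{}_N$ (21) 

(Note in passing that we have $e^\alpha{}_\mu = \tilde e^\alpha{}_\mu$, $e^\alpha{}_5 = 0$, $e^5{}_\mu = A_\mu = A_\alpha e^\alpha{}_\mu$, and $e^5{}_5 = 1$.) 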


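Now that we have set things up, let’s use Cartan’s first structural equation 

$de^A + \omega^A{}_B\,e^B = 0$ (22) 

to determine the connection 1-forms $\omega^A{}_B$. 
First, we have 

$de^5 = d(dy + A_\beta e^\beta) = 0 + \partial_\alpha A_\beta\,e^\alpha e^\beta = \frac12(\partial_\alpha A_\beta - \partial_\beta A_\alpha)\,e^\alpha e^\beta = \frac12 F_{\alpha\beta}\,e^\alpha e^\beta$ (23) 

From $-de^5 = \omega^5{}_\alpha e^\alpha$, we obtain $\omega^5{}_\alpha = \frac12 F_{\alpha\beta}\,e^\beta$. 
Next, $\tilde e^\alpha = e^\alpha$ and so we obtain 

$-d\tilde e^\alpha = \tilde\omega^\alpha{}_\beta e^\beta = -de^\alpha = \omega^\alpha{}_B e^B = \omega^\alpha{}_\beta e^\beta + \omega^\alpha{}_5 e^5 = (\omega^\alpha{}_\beta + \frac12 F^\alpha{}_\beta\,e^5)\,e^\beta$ (24) 

Hence $\omega^\alpha{}_\beta = \tilde\omega^\alpha{}_\beta - \frac12 F^\alpha{}_\beta\,e^5$. Note that because we absorbed $l$, $A$ is dimensionless and $F$ has dimensions of $1/L$, 
consistent with $\omega$ being dimensionless. 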
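Now plug in Cartan’s second structural equation 

$R^A{}_B = d\omega^A{}_B + \omega^A{}_C\,\omega^C{}_B$ (25) 

So we have 

$R^\alpha{}_\beta = d\omega^\alpha{}_\beta + \omega^\alpha{}_\gamma\,\omega^\gamma{}_\beta + \omega^\alpha{}_5\,\omega^5{}_\beta$ 
$= d(\tilde\omega^\alpha{}_\beta - \frac12 F^\alpha{}_\beta e^5) + (\tilde\omega^\alpha{}_\gamma - \frac12 F^\alpha{}_\gamma e^5)(\tilde\omega^\gamma{}_\beta - \frac12 F^\gamma{}_\beta e^5) + (-\frac12 F^\alpha{}_\gamma e^\gamma)(\frac12 F_{\beta\delta}e^\delta)$ (26) 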


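The tilde terms collect themselves nicely into $\tilde R^\alpha{}_\beta$. Also rather nicely, we note that to calculate the scalar 
curvature, we can ignore terms involving $e^\gamma e^5$. After cleaning up a bit, we obtain 

$R^\alpha{}_\beta = \tilde R^\alpha{}_\beta - \frac14 F^\alpha{}_\beta F_{\gamma\delta}\,e^\gamma e^\delta - \frac18(F^\alpha{}_\gamma F_{\beta\delta} - F^\alpha{}_\delta F_{\beta\gamma})\,e^\gamma e^\delta + (e^\gamma e^5\ \text{terms})$ (27) 

Equating this to 

$R^\alpha{}_\beta = \frac12 R^\alpha{}_{\beta\gamma\delta}\,e^\gamma e^\delta + \frac12 R^\alpha{}_{\beta\gamma 5}\,e^\gamma e^5 + \frac12 R^\alpha{}_{\beta 5\gamma}\,e^5 e^\gamma$ (28) 

we obtain 

$R^\alpha{}_{\beta\gamma\delta} = \tilde R^\alpha{}_{\beta\gamma\delta} - \frac12 F^\alpha{}_\beta F_{\gamma\delta} - \frac14(F^\alpha{}_\gamma F_{\beta\delta} - F^\alpha{}_\delta F_{\beta\gamma})$ (29) 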


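Next, using (25) again, we evaluate 

$R^5{}_\alpha = d\omega^5{}_\alpha + \omega^5{}_\beta\,\omega^\beta{}_\alpha = \frac12(\partial_\gamma F_{\alpha\beta})\,e^\gamma e^\beta - \frac12 F_{\alpha\beta}\,\tilde\omega^\beta{}_\gamma e^\gamma - \frac12 F_{\beta\gamma}\,\tilde\omega^\beta{}_\alpha e^\gamma - \frac14 F_{\beta\gamma}F^\beta{}_\alpha\,e^\gamma e^5$ (30) 

Equating this to 

$R^5{}_\alpha = \frac12 R^5{}_{\alpha CD}\,e^C e^D = \frac12 R^5{}_{\alpha\beta\gamma}\,e^\beta e^\gamma + \frac12\cdot 2R^5{}_{\alpha\gamma 5}\,e^\gamma e^5$ (31) 

we can determine $R^5{}_{\alpha\beta\gamma}$ and $R^5{}_{\alpha\gamma 5}$. Note that to calculate the scalar curvature, we don’t give two hoots about 
$R^5{}_{\alpha\beta\gamma}$. Rather, we need to extract $R^5{}_{\alpha\gamma 5}$ from (30). But since $\tilde\omega$ does not involve $e^5$, we deduce that only the last 
term in (30) contributes. We thus obtain 

$R^5{}_{\alpha 5\gamma} = \frac14 F_{\beta\gamma}F^\beta{}_\alpha$ (32) 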


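We are now ready to calculate the Ricci tensors needed to obtain the scalar curvature. Using (29) and (32), we 
find 

$R_{\beta\delta} = R^\gamma{}_{\beta\gamma\delta} + R^5{}_{\beta 5\delta} = \tilde R_{\beta\delta} - \frac12 F^\gamma{}_\beta F_{\gamma\delta}$ (33) 

Next, using (32), we find 

$R_{55} = R^\alpha{}_{5\alpha 5} = \eta^{\alpha\beta}R_{\beta 5\alpha 5} = \eta^{\alpha\beta}R_{5\beta 5\alpha} = \eta^{\alpha\beta}R^5{}_{\beta 5\alpha} = \frac14\eta^{\alpha\beta}F_{\gamma\alpha}F^\gamma{}_\beta = \frac14 F^{\alpha\beta}F_{\alpha\beta}$ (34) 

Finally, putting (33) and (34) together, we obtain the scalar curvature 

$R = \eta^{\beta\delta}R_{\beta\delta} + R_{55} = \tilde R - \frac12 F^{\alpha\beta}F_{\alpha\beta} + \frac14 F^{\alpha\beta}F_{\alpha\beta} = \tilde R - \frac14 F^{\alpha\beta}F_{\alpha\beta} = \tilde R - \frac14 F^{\mu\nu}F_{\mu\nu}$ (35) 

Or, using various other notations, we have $R(G) = R(g) - \frac14 F^{\mu\nu}F_{\mu\nu}$, or $R^{(5)} = R^{(4)} - \frac14 F^{\mu\nu}F_{\mu\nu}$, as promised. 
All this for the $-\frac14$! 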
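
The bookkeeping in the last few contractions is easy to lose track of, so here it is laid out as a sketch. It uses only the antisymmetry of $F$ (so that $F^\gamma{}_\gamma = 0$ and $F^\gamma{}_\delta F_{\beta\gamma} = -F^\gamma{}_\beta F_{\gamma\delta}$) together with the components quoted in (29) and (32).

```latex
\begin{align*}
R^\gamma{}_{\beta\gamma\delta}
  &= \tilde R_{\beta\delta} - \tfrac12 F^\gamma{}_\beta F_{\gamma\delta}
     - \tfrac14\left(F^\gamma{}_\gamma F_{\beta\delta}
     - F^\gamma{}_\delta F_{\beta\gamma}\right)
   = \tilde R_{\beta\delta} - \tfrac34 F^\gamma{}_\beta F_{\gamma\delta}, \\
R^5{}_{\beta 5\delta} &= +\tfrac14 F^\gamma{}_\beta F_{\gamma\delta}, \\
R_{\beta\delta} &= \tilde R_{\beta\delta} - \tfrac12 F^\gamma{}_\beta F_{\gamma\delta},
  \qquad
  R_{55} = +\tfrac14 F^{\alpha\beta}F_{\alpha\beta}, \\
R &= \tilde R + \left(-\tfrac12 + \tfrac14\right)F^{\alpha\beta}F_{\alpha\beta}
   = \tilde R - \tfrac14 F^{\alpha\beta}F_{\alpha\beta}.
\end{align*}
```

The coefficient $-\frac14$ therefore arises as $-\frac34 + \frac14 + \frac14$: two contributions from the 4-dimensional index sums and one from each appearance of the fifth direction.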

Note that, in spite of what we said in the beginning of this appendix, we would have gained only marginally 
in the computation if we had set $\tilde\omega = 0$ to start with. 


Appendix 2: The dilaton or radion in the action 


We have left out the scalar field $\phi(x)$ thus far, but now let us consider $ds^2 = g_{\mu\nu}dx^\mu dx^\nu + \phi(x)^2(dy + A_\mu dx^\mu)^2$. 
(We absorb $a$ into $\phi$, just as we had absorbed $l$ into $A$.) As I have mentioned on more than one occasion, we could 
always proceed by brute force, simply calculating the scalar curvature $R(G)$ with this metric. But since we prefer 
not to sweat, let us see how far we can get with symmetry considerations and the knowledge that two powers of 
spacetime derivatives must appear. 

The scalar curvature is invariant under 

$G_{MN}(X) = G'_{PQ}(X')\,\frac{\partial X'^P}{\partial X^M}\frac{\partial X'^Q}{\partial X^N}$ (36) 

Consider the coordinate transformation $x'^\mu = x^\mu$ and $x'^5 = f(x^5)$, that is, we do not touch the usual spacetime 
coordinates. Then $G_{55}(X) = \phi(x)^2 = G'_{55}(X')\left(\frac{\partial x'^5}{\partial x^5}\right)^2 = \phi'(x')^2\left(\frac{\partial x'^5}{\partial x^5}\right)^2$. Since we require $\phi(x)$ and $\phi'(x')$ to be 
independent of $y$, for the transformation to be allowed, $\frac{\partial x'^5}{\partial x^5}$ can only be equal to a constant $K$. Hence $\phi(x) = 
K\phi'(x') = K\phi'(x)$. Next, setting $(M, N) = (\mu, 5)$ in (36) and recalling that $G_{\mu 5} = \phi^2 A_\mu$, we see that $A_\mu(x) = 
K^{-1}A'_\mu(x)$. This coordinate transformation corresponds to multiplying $\phi$ and dividing $A_\mu$ by a constant $K$. 
Furthermore, $G_{\mu\nu}$ and hence $g_{\mu\nu}$ remain unchanged. Thus, coordinate invariance requires that the Maxwell 
term must now appear in the combination $\phi^2F_{\mu\nu}F^{\mu\nu}$. 

Let us now find those terms in the scalar curvature $R(G)$ containing $\phi$ and two spacetime derivatives. For 
now, set $g_{\mu\nu} = \eta_{\mu\nu}$ and $A_\mu = 0$. Invariance under $\phi(x) = K\phi'(x)$ indicates that there are only two possible terms: 
$\eta^{\mu\nu}\partial_\mu\partial_\nu\phi/\phi$ and $\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi/\phi^2$. To determine the numerical coefficients of these two terms, we simply have to 
calculate the scalar curvature for $ds^2 = \eta_{\mu\nu}dx^\mu dx^\nu + \phi^2dy^2$. What an easy calculation! You could have done it 
way back when you first saw the Riemann curvature tensor. At this point, the diligent reader with a good memory 
jumps up and exclaims, “I did do it as an exercise back in chapter VI.1! The answer was $R = -2\eta^{\mu\nu}\partial_\mu\partial_\nu\phi/\phi$.” 
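
For readers who would rather not flip back to chapter VI.1, here is a sketch of that calculation using the same differential-form method as appendix 1. It assumes flat $g_{\mu\nu} = \eta_{\mu\nu}$, $A_\mu = 0$, and the conventions of (22) and (25).

```latex
\begin{align*}
e^\alpha &= dx^\alpha, \qquad e^5 = \phi\,dy, \qquad
de^5 = \partial_\alpha\phi\; dx^\alpha\, dy
     = \phi^{-1}\partial_\alpha\phi\; e^\alpha e^5, \\
\omega^5{}_\alpha &= \phi^{-1}\partial_\alpha\phi\; e^5 = \partial_\alpha\phi\; dy,
  \qquad \omega^\alpha{}_\beta = 0, \\
R^5{}_\alpha &= d\omega^5{}_\alpha
  = \partial_\beta\partial_\alpha\phi\; dx^\beta\, dy
  = \phi^{-1}\partial_\alpha\partial_\beta\phi\; e^\beta e^5
  \quad\Longrightarrow\quad
  R^5{}_{\alpha 5\beta} = -\phi^{-1}\partial_\alpha\partial_\beta\phi, \\
R_{\alpha\beta} &= R^5{}_{\alpha 5\beta}
  = -\phi^{-1}\partial_\alpha\partial_\beta\phi, \qquad
R_{55} = \eta^{\alpha\beta}R^5{}_{\beta 5\alpha}
  = -\phi^{-1}\eta^{\alpha\beta}\partial_\alpha\partial_\beta\phi, \\
R &= \eta^{\alpha\beta}R_{\alpha\beta} + R_{55}
  = -2\,\phi^{-1}\eta^{\mu\nu}\partial_\mu\partial_\nu\phi .
\end{align*}
```

Only the first of the two allowed terms appears; the coefficient of $\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi/\phi^2$ is zero.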

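For curved spacetime, we simply promote $\eta^{\mu\nu}\partial_\mu\partial_\nu\phi$ (the flat space d’Alembertian of $\phi$) to 

$\Box\phi = \frac{1}{\sqrt{-g}}\,\partial_\mu(\sqrt{-g}\,g^{\mu\nu}\partial_\nu\phi)$ 

(the curved space d’Alembertian of $\phi$). We thus obtain 

$R(G) = R(g) - \frac14\phi^2F_{\mu\nu}F^{\mu\nu} - \frac{2}{\phi}\Box\phi$ (37) 

where all contraction on the right hand side is done with $g_{\mu\nu}$. 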


Appendix 3: Who frames the action? 


Recalling that $\sqrt{-G} = \sqrt{-g}\,\phi$, we can now use (37) to write the Kaluza-Klein action $S_5 = \int d^4x\,dy\,\sqrt{-G}\,M_P^2R(G)$ 
as 

$S_{\rm Jordan} = M_P^2\int d^4x\sqrt{-g}\,(\phi R - \frac14\phi^3F_{\mu\nu}F^{\mu\nu} - 2\Box\phi)$ (38) 

This is known as the action in the Jordan frame after Pascual Jordan, one of the founders of quantum mechanics 
and quantum field theory. We could perfectly well work with this action, expanding the field $\phi$ around 1 and 
treating the deviation from 1 as a small fluctuation. 

But we can also elect to get rid of the $\phi$ in front of the scalar curvature if we like. How can we do that? That 
question is actually a memory test for you. Do you remember? 

Way way back in exercise I.6.10, you learned that two spaces described by the metrics $\tilde g_{\mu\nu}$ and $g_{\mu\nu}$ are said 
to be conformally related if $\tilde g_{\mu\nu}(x) = \Omega^2(x)\,g_{\mu\nu}(x)$. Furthermore, you worked out (please don’t tell me that you 
didn’t!) in exercise VI.1.13 that the scalar curvatures in the two spaces (with dimension $d = 4$) are related by 
$\tilde R = \Omega^{-2}R - 6\Omega^{-3}\Box\Omega$, with $\Box$ the curved space d’Alembertian constructed using $g_{\mu\nu}$. Surely you remember that 
now? This shows that with a judicious choice of $\Omega$, we can recover the Einstein-Hilbert action without a $\phi$ in 
front. 

So, for notational convenience, let us now put a tilde on the metric in the Jordan frame. In other words, Kaluza-Klein 
theory with the dilaton field $\phi$ has given us the action 
$S_{\rm Jordan} = M_P^2\int d^4x\sqrt{-\tilde g}\,\phi\,(\tilde R - \frac14\phi^2F_{\mu\nu}F_{\lambda\rho}\tilde g^{\mu\lambda}\tilde g^{\nu\rho} - 2\tilde\Box\phi/\phi)$, 
with $\tilde\Box$ evidently the d’Alembertian constructed with $\tilde g_{\mu\nu}$. 

Now set $\tilde g_{\mu\nu} = \Omega^2g_{\mu\nu}$, so that $\sqrt{-\tilde g} = \sqrt{-g}\,\Omega^4$ and $\tilde g^{\mu\nu} = \Omega^{-2}g^{\mu\nu}$. Thus, we have 

$\sqrt{-\tilde g}\,\phi\tilde R = \sqrt{-g}\,\Omega^4\phi\,(\Omega^{-2}R - 6\Omega^{-3}\Box\Omega)$ 


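If we set $\Omega^2 = 1/\phi$, we recover the Einstein-Hilbert action at the cost of generating some more scalar terms. In 
fact, we obtain 

$\sqrt{-\tilde g}\,\phi\tilde R = \sqrt{-g}\,(R + 3\phi^{-1}\Box\phi - \frac92\phi^{-2}g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi)$ (39) 

For future reference, we now have $\sqrt{-\tilde g} = \sqrt{-g}/\phi^2$ and $\tilde g^{\mu\nu} = \phi\,g^{\mu\nu}$. 
We turn next to the Maxwell term: 

$\sqrt{-\tilde g}\,\phi\,(\phi^2F_{\mu\nu}F_{\lambda\rho}\tilde g^{\mu\lambda}\tilde g^{\nu\rho}) = \sqrt{-g}\,\phi^{-2}\phi\,(\phi^2F_{\mu\nu}F_{\lambda\rho}g^{\mu\lambda}g^{\nu\rho}\phi^2) = \sqrt{-g}\,\phi^3F_{\mu\nu}F^{\mu\nu}$ 

(Note that the raising of indices in $F^{\mu\nu}$ is now done with the metric $g^{\mu\nu}$.) 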

Recall that in the two preceding appendices, we absorbed the two Kaluza-Klein lengths $a$ and $l$. Now put them 
back by scaling $\phi \to \phi/a$ and $A_\mu \to lA_\mu$. Putting all this together, we obtain (using $M_Pl = 1$) the action in the 
so-called Einstein frame: 

$S_{\rm Einstein} = \int d^4x\sqrt{-g}\,(M_P^2R - \frac14a^{-3}\phi^3F_{\mu\nu}F^{\mu\nu} + M_P^2(\phi^{-1}\Box\phi - \frac52\phi^{-2}g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi))$ (40) 

The multiplicative invariance $\phi \to K\phi$ in the preceding appendix clearly suggests a change of variable 
$\phi = Ce^{\lambda\varphi}$, with $C$ and $\lambda$ some real numbers. We now have an additive invariance under $\varphi \to \varphi + $ constant. By 
invariance (or simple differentiation) we deduce that the two purely scalar terms in (40) can only become a linear 
combination of $\Box\varphi$ and $g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi$. The first term goes away upon integration by parts. Thus, we finally obtain 

$S_{\rm Einstein} = \int d^4x\sqrt{-g}\,(M_P^2R - \frac14e^{\sqrt3\,l_P\varphi}F_{\mu\nu}F^{\mu\nu} - \frac12g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi)$ (41) 

(We have chosen $C = a$ and $\lambda = l_P/\sqrt3$, so that the kinetic energy term for $\varphi$ and the Maxwell term for $\varphi = 0$ have 
their standard forms.) 
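
The scalar coefficients can be tracked explicitly. The following sketch assumes $\tilde g_{\mu\nu} = \Omega^2g_{\mu\nu}$ with $\Omega = \phi^{-1/2}$, writes $(\partial\phi)^2$ for $g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi$, and takes $l_P = 1/M_P$; the last identification is an assumption made here for definiteness.

```latex
\begin{align*}
\Box\Omega &= -\tfrac12\phi^{-3/2}\Box\phi + \tfrac34\phi^{-5/2}(\partial\phi)^2
  \quad\Longrightarrow\quad
  -6\,\Omega\,\phi\,\Box\Omega
  = 3\phi^{-1}\Box\phi - \tfrac92\phi^{-2}(\partial\phi)^2, \\
-2\sqrt{-\tilde g}\,\tilde\Box\phi
  &= -2\,\partial_\mu\!\left(\sqrt{-g}\,\phi^{-1}g^{\mu\nu}\partial_\nu\phi\right)
   = \sqrt{-g}\left(-2\phi^{-1}\Box\phi + 2\phi^{-2}(\partial\phi)^2\right), \\
\text{sum of scalar terms} &= \phi^{-1}\Box\phi - \tfrac52\phi^{-2}(\partial\phi)^2, \\
\phi = Ce^{\lambda\varphi}: \quad
\phi^{-1}\Box\phi &= \lambda\,\Box\varphi + \lambda^2(\partial\varphi)^2, \qquad
\phi^{-2}(\partial\phi)^2 = \lambda^2(\partial\varphi)^2, \\
M_P^2\left(\phi^{-1}\Box\phi - \tfrac52\phi^{-2}(\partial\phi)^2\right)
  &= M_P^2\lambda\,\Box\varphi - \tfrac32 M_P^2\lambda^2(\partial\varphi)^2 .
\end{align*}
```

Dropping the total derivative $\Box\varphi$ and demanding $\frac32M_P^2\lambda^2 = \frac12$ gives $\lambda = l_P/\sqrt3$, and with $C = a$ the Maxwell prefactor $a^{-3}\phi^3$ becomes $e^{3\lambda\varphi} = e^{\sqrt3\,l_P\varphi}$. Note that the $-2\tilde\Box\phi$ term is itself a total derivative, so keeping or dropping it changes the intermediate coefficients but not the final kinetic term.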


Appendix 4: Charged scalar field 


As promised in the text, here we study a charged scalar field $\Phi$ with mass $m$ in (4 + 1) dimensions more carefully. 
Its action is given by 

$S = \int d^4x\int dy\,\sqrt{-G}\,(-\partial_M\Phi^\dagger\,G^{MN}\partial_N\Phi - m^2\Phi^\dagger\Phi)$ (42) 

For simplicity, we will let the (3 + 1)-dimensional spacetime be flat and set $g_{\mu\nu} = \eta_{\mu\nu}$. To evaluate this action, we 
first have to invert the matrix (9) to obtain 

$G^{MN} = \begin{pmatrix} \eta^{\mu\nu} & -lA^\mu \\ -lA^\nu & 1 + l^2A^2 \end{pmatrix}$ (43) 

where $A^2 = A_\mu A^\mu = \eta^{\mu\nu}A_\mu A_\nu$. (Compare with (16).) Let us now specialize to the particular mode $\Phi(x, y) = 
\varphi_n(x)\,e^{iny/a}$ and insert this into (42). Integrating trivially over $y$, using the relation $l = ea$ (and suppressing the 
subscript $n$ on the complex scalar field $\varphi$, which evidently should not be confused with the real scalar field $\varphi$ in 
the preceding appendix), we obtain 

$S = 2\pi a\int d^4x\,\left(-(\partial_\mu + ineA_\mu)\varphi^\dagger\,\eta^{\mu\nu}(\partial_\nu - ineA_\nu)\varphi - m_n^2\,\varphi^\dagger\varphi\right)$ (44) 

Those readers familiar with the action of a complex scalar field in the presence of an electromagnetic field will 
see that $\varphi$ describes a particle with charge $ne$ and mass $m_n$ given by $m_n^2 = m^2 + (n/a)^2$. This makes precise the 
discussion in the text. 
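
Both ingredients of this appendix, the inverse metric (43) and the tower of charges and masses, are easy to check numerically. The following is a minimal Python sketch using exact rational arithmetic. The helper names (`kk_metric`, `kk_inverse`, `kk_mode`) are hypothetical and introduced only for illustration. The code assumes natural units, a flat 4-dimensional metric of signature (−,+,+,+), and the metric read off from (20), namely $G_{\mu\nu} = \eta_{\mu\nu} + l^2A_\mu A_\nu$, $G_{\mu 5} = lA_\mu$, $G_{55} = 1$.

```python
import math
from fractions import Fraction

ETA = [-1, 1, 1, 1]  # diagonal entries of eta, signature (-,+,+,+) (assumed)


def kk_metric(A, l):
    """5x5 metric read off from (20) for flat 4d space:
    G_mn = eta_mn + l^2 A_m A_n, G_m5 = l A_m, G_55 = 1."""
    G = [[(ETA[m] if m == n else 0) + l * l * A[m] * A[n] for n in range(4)]
         + [l * A[m]] for m in range(4)]
    G.append([l * A[n] for n in range(4)] + [1])
    return G


def kk_inverse(A, l):
    """Candidate inverse as quoted in (43):
    G^mn = eta^mn, G^m5 = -l A^m, G^55 = 1 + l^2 A^2."""
    A_up = [ETA[m] * A[m] for m in range(4)]           # A^m = eta^{mn} A_n
    A2 = sum(A_up[m] * A[m] for m in range(4))         # A^2 = A_m A^m
    Ginv = [[(ETA[m] if m == n else 0) for n in range(4)]
            + [-l * A_up[m]] for m in range(4)]
    Ginv.append([-l * A_up[n] for n in range(4)] + [1 + l * l * A2])
    return Ginv


def matmul(X, Y):
    """Plain matrix product of two square nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kk_mode(n, m, a, e):
    """Charge n*e and mass sqrt(m^2 + (n/a)^2) of the n-th mode."""
    return n * e, math.sqrt(m * m + (n / a) ** 2)
```

With arbitrary rational entries for $A_\mu$ and $l$, the product `matmul(kk_metric(A, l), kk_inverse(A, l))` should come out as the 5 × 5 identity, and `kk_mode(3, 4.0, 1.0, 1.0)` gives charge 3 and mass 5 in units where $a = 1$.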


Appendix 5: Emergence of Yang-Mills structure 


After Kaluza proposed the hidden internal space to be a circle, it seems glaringly obvious, at least in hindsight, 
that we should consider the sphere next, and that was exactly what Klein did. As mentioned in the text, Yang-Mills 
structure naturally emerges from Kaluza-Klein theory. Indeed, we could already have guessed from the 
index structure alone how the construction would go. 

The discussion here can be couched in fairly general terms, but for pedagogical clarity and definiteness, the 
reader seeing this for the first time should imagine the internal space as the sphere $S^2$. Other readers might want 
to think of an arbitrary coset manifold $G/H$, as explained in appendix 1 to chapter IX.6. As in the text, divide the 
coordinates: $X^M = (x^\mu, y^i)$. The letter $x$ when unspecified will refer to $x^\mu$ only. 

Generalize $ds^2 = g_{\mu\nu}dx^\mu dx^\nu + (dy + lA_\mu dx^\mu)^2$ in (8) to 

$ds^2 = g_{\mu\nu}(x)\,dx^\mu dx^\nu + g_{ij}(y)\,(dy^i + A^a_\mu(x)\xi^i_a(y)\,dx^\mu)(dy^j + A^b_\nu(x)\xi^j_b(y)\,dx^\nu)$ (45) 

(with $l = 1$ for maximal clarity). The internal space is coordinatized by $y^i$ and has Killing vectors $\xi^i_a(y)$ indexed 
by $a$. 

Start by remembering how isometry works in the absence of the gauge potentials. So set $A^a_\mu$ to 0. The 
transformation $y^i \to y^i + \Lambda^a\xi^i_a(y)$ (with $\Lambda^a$ infinitesimal constants) is supposed to leave $g_{ij}\,dy^idy^j$ invariant. 
Let us now verify this. We have $dy^i \to dy^i + \Lambda^a\partial_k\xi^i_a\,dy^k$, or in other words, $\delta(dy^i) = \Lambda^a\partial_k\xi^i_a\,dy^k$. Then we obtain 

$\delta(g_{ij}\,dy^idy^j) = \Lambda^a(\xi^k_a\,\partial_kg_{ij}\,dy^idy^j + g_{ij}\,\partial_k\xi^i_a\,dy^kdy^j + g_{ij}\,\partial_k\xi^j_a\,dy^idy^k)$ (46) 

This vanishes if the Killing equation (IX.6.2) $\xi^k_a\,\partial_kg_{ij} + g_{kj}\,\partial_i\xi^k_a + g_{ik}\,\partial_j\xi^k_a = 0$ holds. Indeed, this provides an 
alternative (but essentially the same, of course) derivation of the Killing equation. 

Now turn on the gauge potentials, so that $dy^i$ is replaced by $Dy^i = dy^i + A^a_\mu\xi^i_a\,dx^\mu$. Consider the transformation 
$y^i \to y^i + \Lambda^a(x)\,\xi^i_a(y)$, with the infinitesimals $\Lambda^a(x)$ allowed to depend on $x$. We want to know how the 
gauge potentials $A^a_\mu$ should transform for $g_{ij}\,Dy^iDy^j$ to remain unchanged. 


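Let us split the calculation of the change $\delta(Dy^i)$ into two pieces: 

$\delta(dy^i) = d(\Lambda^a\xi^i_a) = \Lambda^a\partial_k\xi^i_a\,dy^k + \xi^i_a\,\partial_\mu\Lambda^a\,dx^\mu$ (47) 

and 

$\delta(A^a_\mu\xi^i_a\,dx^\mu) = (\delta A^a_\mu)\,\xi^i_a\,dx^\mu + A^a_\mu\Lambda^b\xi^k_b\,\partial_k\xi^i_a\,dx^\mu$ (48) 

Evidently, it is important to keep in mind that $\Lambda^a$ and $A^a_\mu$ depend only on $x$, while $\xi^i_a$ depends only on $y$. Our job 
is to determine $\delta A^a_\mu$ by requiring $\delta(g_{ij}\,Dy^iDy^j) = 0$. 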

One clue is that we must use our knowledge that the isometries form a group, namely Lie’s equation, as 
discussed in chapter IX.6. There we worked out the Killing vectors for the sphere $S^2 = SO(3)/SO(2)$ and showed 
that they satisfy $[\xi_a, \xi_b] = \epsilon_{abc}\,\xi_c$. When written out in components, this means, in the notation used here, 

$\xi^k_a\,\partial_k\xi^i_b - \xi^k_b\,\partial_k\xi^i_a = \epsilon_{abc}\,\xi^i_c$ (49) 

The preceding exercise gives us another clue: thanks to the Killing equation, the requirement $\delta(g_{ij}\,Dy^iDy^j) = 0$ 
will be satisfied if $\delta(Dy^i) = \Lambda^b\partial_k\xi^i_b\,Dy^k$. Staring at (47) and (48), we see that we want $\delta A^a_\mu$ to contain two pieces. 
First, we need a piece $-\partial_\mu\Lambda^a$ to cancel off the second term in (47). This is more or less expected, since the 
electromagnetic gauge potential transforms similarly: $\delta A_\mu = -\partial_\mu\Lambda$. Second, we need another piece to convert 
the combination $\xi^k_b\,\partial_k\xi^i_a$ in the second term in (48) to $\xi^k_a\,\partial_k\xi^i_b$, that is, to interchange the indices $a$ and $b$. How can 
you do that? Think for a moment before reading on. You are about to discover Yang-Mills theory. 

Of course, I have set it all up for you. Come on, use (49)! Write it in the form $\xi^k_b\,\partial_k\xi^i_a + \epsilon_{abc}\,\xi^i_c = \xi^k_a\,\partial_k\xi^i_b$. We see 
that we need 

$\delta A^c_\mu = -\partial_\mu\Lambda^c + \epsilon_{abc}\,A^a_\mu\Lambda^b$ (50) 


Figure 2 Construct our spacetime by piling sheets on top of one another, with each sheet labeled by 
$x^\mu$. Within each sheet, points are located by the internal coordinates $y^i$. 


Plug this in and watch the right hand side of (48) become $-\xi^i_c\,\partial_\mu\Lambda^c\,dx^\mu + A^a_\mu\Lambda^b\xi^k_a\,\partial_k\xi^i_b\,dx^\mu$. Hence $\delta(Dy^i) = 
\Lambda^b\partial_k\xi^i_b\,Dy^k$, as desired. 
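
Collecting the pieces in one place makes the cancellation visible. This is a sketch that assumes Lie’s equation in the component form $\xi^k_a\,\partial_k\xi^i_b - \xi^k_b\,\partial_k\xi^i_a = \epsilon_{abc}\,\xi^i_c$, with $\xi_a = \xi^k_a\partial_k$; a different sign convention for the structure constants would flip the sign of the nonabelian term in $\delta A^c_\mu$.

```latex
\begin{align*}
\delta(Dy^i)
 &= \Lambda^b\,\partial_k\xi^i_b\,dy^k + \xi^i_a\,\partial_\mu\Lambda^a\,dx^\mu
    + (\delta A^a_\mu)\,\xi^i_a\,dx^\mu
    + A^a_\mu\Lambda^b\,\xi^k_b\,\partial_k\xi^i_a\,dx^\mu , \\
\delta A^c_\mu &= -\partial_\mu\Lambda^c + \epsilon_{abc}A^a_\mu\Lambda^b ,
 \qquad
 \xi^k_b\,\partial_k\xi^i_a = \xi^k_a\,\partial_k\xi^i_b - \epsilon_{abc}\,\xi^i_c , \\
\delta(Dy^i)
 &= \Lambda^b\,\partial_k\xi^i_b\left(dy^k + A^a_\mu\,\xi^k_a\,dx^\mu\right)
  = \Lambda^b\,\partial_k\xi^i_b\,Dy^k .
\end{align*}
```

The $\partial_\mu\Lambda$ terms cancel between (47) and the inhomogeneous part of $\delta A^c_\mu$, while the two $\epsilon_{abc}$ terms cancel against each other, leaving exactly the covariant form required by the Killing equation.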

Just as the transformation (1) fixes the electromagnetic field strength $F_{\mu\nu}$, as was discussed back in chapter IV.2, 
we expect the transformation (50) to fix the Yang-Mills field strength $F^a_{\mu\nu}$ (which in fact carries, as you 
could have guessed, an extra group index $a$ compared to the electromagnetic field strength). Given $F^a_{\mu\nu}$ (see 
appendix 8), the action follows almost immediately.¹⁶ 

For the record, from (45), we can read off 

$G_{ij} = g_{ij}$, $G_{\mu j} = A^a_\mu\,\xi^i_a\,g_{ij}$, $G_{\mu\nu} = g_{\mu\nu} + (g_{ij}\,\xi^i_a\xi^j_b)\,A^a_\mu A^b_\nu$ (51) 


Appendix 6: Kaluza-Klein as foliation 


I now give¹⁷ a geometrical and pictorial derivation of the Kaluza-Klein metric. In general, we have 

$G_{MN}(x, y) = \begin{pmatrix} G_{\mu\nu} & G_{\mu j} \\ G_{i\nu} & G_{ij} \end{pmatrix}$ (52) 

Our goal here is to identify the components of $G_{MN}$ in terms of the metric $g_{\mu\nu}$ of the space we live in and the 
metric $g_{ij}$ of the internal space. 

Think of a patch of the internal space as a sheet. Picture our spacetime constructed by piling sheets on top of 
one another (see figure 2). The sheets are labeled and distinguished by $x^\mu$. Within each sheet, points are located 
by the coordinates $y^i$. 

Infinitesimal displacements characterized by $\delta y^i$ lie completely within a given sheet (that is, $\delta x^\mu = 0$: no 
translating in the space we live in). The distance squared is then $ds^2 = G_{MN}\,\delta X^M\delta X^N = G_{ij}\,\delta y^i\delta y^j$. This means, 
somewhat trivially, that the metric for the internal space is given by 

$g_{ij} = G_{ij}$ (53) 


Less trivially, suppose we want to translate purely in spacetime. By “purely,” we mean that the translation is to 
be perpendicular to the internal space. But we need to be careful about what we mean by “perpendicular.” In fact, 
as we will see presently, a displacement perpendicular to the internal space will involve displacing in $y$ as well. 
Roughly speaking, the $y$ coordinate markings in the sheet labeled by $x + \delta x$ will not in general be lined up with 
the $y$ coordinate markings in the sheet labeled by $x$. In other words, the desired displacement $\delta X^M = (\delta x^\mu, \delta y^i)$ 
must be perpendicular to any infinitesimal internal displacement* $\delta'X^M = (0, \delta'y^i)$. With the invariant definition 
of “perpendicular,” this means that $\delta X^M G_{MN}\,(0, \delta'y)^N = \delta X^M G_{Mj}\,\delta'y^j = 0$. Since $\delta'y$ is arbitrary, it follows that 

$0 = \delta x^\mu G_{\mu j} + \delta y^i G_{ij}$ (54) 


* Here $\delta'$ is to be thought of as another Greek letter. The prime is not an operation on $\delta$. In other words, $\delta'y$ 
is just some arbitrary infinitesimal change in $y$ different from $\delta y$ (like $\delta_1x$ and $\delta_2x$ in chapter VI.1). We want to 
find the restriction perpendicularity imposes on $\delta X^M$. 




We can solve for 

$\delta y^i = -g^{ij}G_{j\mu}\,\delta x^\mu$ (55) 

where, importantly, $g^{ij}$ is the inverse of $g_{ij} = G_{ij}$, not the $ij$ component of $G^{MN}$, the inverse of $G_{MN}$. (Got that?) 

Now that we have determined a displacement in spacetime $\delta X^M = (\delta x^\mu, -g^{ij}G_{j\nu}\,\delta x^\nu)$, we can calculate its 
length squared: 

$G_{MN}\,\delta X^M\delta X^N = G_{M\nu}\,\delta X^M\delta x^\nu = (G_{\mu\nu}\,\delta x^\mu + G_{i\nu}\,\delta y^i)\,\delta x^\nu = (G_{\mu\nu} - G_{\mu i}\,g^{ij}G_{j\nu})\,\delta x^\mu\delta x^\nu$ (56) 

(The first equality holds since by construction, $G_{Mj}\,\delta X^M\delta y^j = 0$ for any $\delta y$. In the third expression, $\delta y^i$ is 
the specific infinitesimal change determined in (55).) In other words, we can identify the spacetime metric as 
$g_{\mu\nu} = G_{\mu\nu} - G_{\mu i}\,g^{ij}G_{j\nu}$. Note that it is not just $G_{\mu\nu}$ (as poor Confusio might have thought). 

Collecting, we can finally write 

$G_{MN} = \begin{pmatrix} g_{\mu\nu} + G_{\mu i}\,g^{ij}G_{j\nu} & G_{\mu j} \\ G_{i\nu} & g_{ij} \end{pmatrix}$ (57) 

We now understand the geometrical origin of the Kaluza-Klein form in (9) and (15)! Needless to say, this is also 
entirely consistent with (51). 
At this point, our friend the Jargon Guy interjects and tells us that we have been describing foliation. Thanks, 
Jargon Guy! 
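
The perpendicularity construction can be tested on numbers. Below is a minimal Python sketch with exact fractions, restricted for simplicity to a 2-dimensional internal space so that the inverse of $g_{ij}$ can be written in closed form. The function names are hypothetical, and the sample blocks used to exercise them are arbitrary rather than taken from any physical solution.

```python
from fractions import Fraction as Fr


def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def perpendicular_shift(G_mu_j, g_ij, dx):
    """delta y^i = -g^{ij} G_{j mu} delta x^mu, as in (55)."""
    g_inv = inverse_2x2(g_ij)
    # G_{mu j} dx^mu for each internal index j
    G_dx = [sum(G_mu_j[mu][j] * dx[mu] for mu in range(len(dx)))
            for j in range(2)]
    return [-sum(g_inv[i][j] * G_dx[j] for j in range(2)) for i in range(2)]


def full_length_squared(G_mu_nu, G_mu_j, g_ij, dx, dy):
    """G_MN dX^M dX^N for dX = (dx, dy), using every block of G_MN."""
    n = len(dx)
    ext = sum(G_mu_nu[m][k] * dx[m] * dx[k] for m in range(n) for k in range(n))
    mixed = 2 * sum(G_mu_j[m][j] * dx[m] * dy[j]
                    for m in range(n) for j in range(2))
    internal = sum(g_ij[i][j] * dy[i] * dy[j]
                   for i in range(2) for j in range(2))
    return ext + mixed + internal


def spacetime_metric(G_mu_nu, G_mu_j, g_ij):
    """g_{mu nu} = G_{mu nu} - G_{mu i} g^{ij} G_{j nu}."""
    g_inv = inverse_2x2(g_ij)
    n = len(G_mu_nu)
    return [[G_mu_nu[m][k]
             - sum(G_mu_j[m][i] * g_inv[i][j] * G_mu_j[k][j]
                   for i in range(2) for j in range(2))
             for k in range(n)] for m in range(n)]
```

For any symmetric blocks and any $\delta x$, the shift returned by `perpendicular_shift` should satisfy (54), and `full_length_squared` evaluated on $(\delta x, \delta y)$ should agree with the quadratic form built from `spacetime_metric`, which is the content of (56).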


Appendix 7: The Kaluza-Klein metric in the vielbein formalism 


Here we derive the Kaluza-Klein metric (9) and (57) for arbitrary dimension using the vielbein formalism of 
chapter IX.7. Denote the vielbein for the extended spacetime by $e^A{}_M$. Here $M = (\mu, i)$, and $A = (\alpha, a)$. The indices 
$\mu$ and $\alpha$ run over 0, 1, 2, 3, and the indices $i$ and $a$ run over $5, \cdots, d$. The metric is given by $G_{MN} = \eta_{AB}\,e^A{}_M e^B{}_N$. 
The world index $M$ is contracted with $G_{MN}$, and the Lorentz index $A$ with the extended Minkowski metric $\eta_{AB}$, 
containing the usual Minkowski metric $\eta_{\alpha\beta}$ and $\eta_{ab} = \delta_{ab}$. (This means that the indices $A = (\alpha, a)$ can be raised 
and lowered at will (up to a sign).) For example, $e_A{}^M = \eta_{AB}\,e^{BM} = \eta_{AB}\,G^{MN}e^B{}_N$, with $G^{MN}$ the inverse of $G_{MN}$. 
The discussion in this appendix complements that in the preceding appendix to some extent. 
For some arbitrary $\delta\lambda^A$, we displace $X^M$ in the $A$th direction by 

$\delta X^M = \delta\lambda^A\,e_A{}^M$ (58) 


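We say that $\delta\lambda^a$ generates a displacement in the internal space, with $\delta x^\mu = 0$ by definition, thus implying 

$e_a{}^\mu = 0 \Rightarrow e^{a\mu} = 0$ (59) 

We then have 

$e_A{}^M = \begin{pmatrix} e_\alpha{}^\mu & e_\alpha{}^i \\ 0 & e_a{}^i \end{pmatrix}$ (60) 

Note that this introduces an asymmetry between external and internal spaces. 

The orthonormality of the vielbein $e^{AM}e^B{}_M = \eta^{AB}$ implies $0 = \eta^{a\alpha} = e^{aM}e^\alpha{}_M = e^{a\mu}e^\alpha{}_\mu + e^{ai}e^\alpha{}_i = e^{ai}e^\alpha{}_i$, where we have 
used (59). Thus we have 

$e^\alpha{}_i = 0 \Rightarrow e_{\alpha i} = 0$ (61) 

This gives 

$e^A{}_M = \begin{pmatrix} e^\alpha{}_\mu & 0 \\ e^a{}_\mu & e^a{}_i \end{pmatrix}$ (62) 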

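In what follows, we evaluate the metric $G_{MN} = \eta_{AB}\,e^A{}_M e^B{}_N$ using (59) and (61) repeatedly. 
First, we obtain 

$G_{ij} = \eta_{\alpha\beta}\,e^\alpha{}_i e^\beta{}_j + \eta_{ab}\,e^a{}_i e^b{}_j = \eta_{ab}\,e^a{}_i e^b{}_j \equiv g_{ij}$ (63) 

The last step amounts to the definition of the internal metric $g_{ij} = \eta_{ab}\,e^a{}_i e^b{}_j$. 

Next, introducing the notation $N_{i\mu} = G_{i\mu}$ and $N_{\mu i} = G_{\mu i}$, we have 

$N_{\nu i} = N_{i\nu} \equiv G_{i\nu} = \eta_{\alpha\beta}\,e^\alpha{}_i e^\beta{}_\nu + \eta_{ab}\,e^a{}_i e^b{}_\nu = \eta_{ab}\,e^a{}_i e^b{}_\nu = e^a{}_i\,e_{a\nu}$ (64) 

Since (59) implies $e_a{}^i e^b{}_i = e_a{}^M e^b{}_M = \delta^b_a$, we can write $e_{a\nu} = e_a{}^i N_{i\nu}$. 
Finally, we have 

$G_{\mu\nu} = \eta_{\alpha\beta}\,e^\alpha{}_\mu e^\beta{}_\nu + \eta_{ab}\,e^a{}_\mu e^b{}_\nu = g_{\mu\nu} + \eta^{ab}e_{a\mu}e_{b\nu} = g_{\mu\nu} + \eta^{ab}e_a{}^i e_b{}^j N_{i\mu}N_{j\nu} = g_{\mu\nu} + N_{i\mu}\,g^{ij}N_{j\nu}$ (65) 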

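In the second equality, we defined $g_{\mu\nu} = \eta_{\alpha\beta}\,e^\alpha{}_\mu e^\beta{}_\nu$; in the third equality, we used the result of the previous step; 
and in the fourth equality, we defined $g^{ij} = \eta^{ab}e_a{}^i e_b{}^j$. It is worth emphasizing that $g^{ij}$ is the inverse of $g_{ij}$, not the 
$ij$ component of $G^{MN}$, namely $G^{ij}$. (If this does not sound familiar, read the preceding appendix again.) Let’s 
check this: $g^{ij}g_{jk} = \eta^{ab}e_a{}^i e_b{}^j\,\eta_{cd}\,e^c{}_j e^d{}_k = \eta^{ab}e_a{}^i\,\eta_{cd}\,\delta^c_b\,e^d{}_k = e_a{}^i e^a{}_k = \delta^i_k$. 
So in summary, we have 

$G_{MN} = \begin{pmatrix} g_{\mu\nu} + N_{\mu i}\,g^{ij}N_{j\nu} & N_{\mu j} \\ N_{i\nu} & g_{ij} \end{pmatrix}$ (66) 

We have obtained once again the Kaluza-Klein metric (57) (and (9)), with $g_{\mu\nu} = \eta_{\alpha\beta}\,e^\alpha{}_\mu e^\beta{}_\nu$. 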

The form of the metric in (66) generalizes an expression¹⁸ found in the celebrated textbook by Misner, Thorne, 
and Wheeler, described there as “pushing forward the many fingers of Time.” I don’t know about you, but as an 
undergraduate studying with Wheeler, I had considerable difficulty in picturing the many fingers of Time. Here 
we derived (66) by following our nose. At this point, our friend the Jargon Guy kindly informs us that various 
quantities in (66) have names like “lapse” and “shift.” 

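Again, for the record, the inverse of $G_{MN}$ is given by 

$G^{MN} = \begin{pmatrix} g^{\mu\nu} & -g^{\mu\rho}N_{\rho k}\,g^{kj} \\ -g^{ik}N_{k\rho}\,g^{\rho\nu} & g^{ij} + g^{ik}N_{k\rho}\,g^{\rho\sigma}N_{\sigma l}\,g^{lj} \end{pmatrix}$ (67) 

where $g^{\mu\nu}$ is the inverse of $g_{\mu\nu}$. 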
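
The block form of the inverse is easy to get wrong by a sign or an index placement, so a numerical check is worthwhile. The Python sketch below assembles the metric of (66) and the claimed inverse of (67) for a toy case with two external and two internal dimensions, using exact fractions. All function names are hypothetical, and the code assumes that $g_{\mu\nu}$ and $g_{ij}$ are symmetric and invertible.

```python
from fractions import Fraction as Fr


def inv2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mul(X, Y):
    """Matrix product of nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def neg(X):
    return [[-x for x in row] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def assemble(tl, tr, bl, br):
    """Glue four blocks into one square matrix."""
    top = [r1 + r2 for r1, r2 in zip(tl, tr)]
    bottom = [r1 + r2 for r1, r2 in zip(bl, br)]
    return top + bottom


def metric_66(g, N, h):
    """G_MN of (66): g = g_{mu nu}, N = N_{mu i}, h = g_{ij}."""
    return assemble(add(g, mul(mul(N, inv2(h)), transpose(N))),
                    N, transpose(N), h)


def inverse_67(g, N, h):
    """Claimed inverse G^MN of (67)."""
    g_inv, h_inv = inv2(g), inv2(h)
    cross = mul(mul(g_inv, N), h_inv)        # g^{mu rho} N_{rho k} g^{kj}
    return assemble(g_inv, neg(cross), neg(transpose(cross)),
                    add(h_inv, mul(mul(transpose(cross), N), h_inv)))
```

Multiplying `metric_66(g, N, h)` by `inverse_67(g, N, h)` should return the 4 × 4 identity for any admissible choice of the three blocks. In particular, the upper left block of the inverse is just $g^{\mu\nu}$, with no trace of $N$.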


Appendix 8: A more geometrical view of Kaluza-Klein theory and emergence of Yang-Mills structure 


As is made clear by the discussion in the text, we could simply insert $G_{\mu j}(x, y) = g_{ij}(y)\,\xi^i_a(y)\,A^a_\mu(x)$ into the 
higher dimensional Einstein-Hilbert action and watch the Yang-Mills action emerge. Just plug and chug. (The 
calculation is similar to, but more involved than, the calculation we trudged through in appendix 1. At some 
point, you clearly have to use the properties of the Killing vectors.) Since this is readily worked out (and also 
available in a number of places), I elect to follow a rather different, and more geometrical, approach. 

Let us contemplate the following question. Given $G_{MN}$ in (66), what must it satisfy for us to be able to bring 
it to the block diagonal form 

$\begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix}$ (68) 

by a coordinate transformation? In other words, under what condition do the internal and external geometries 
decouple? 

Here we are inspired by the electromagnetic case. Back in (15), the internal and external geometries decouple 
if $A_\mu = 0$. But this is not a gauge invariant statement; the correct condition* is that $F_{\mu\nu} = 0$. Thus, we expect that 
the condition for the internal and external geometries to decouple is the vanishing of the analog of $F_{\mu\nu}$, and we 
can identify that object as the Yang-Mills field strength. 


* If you are ever asked the question, “What is the electromagnetic field?” you can answer that within Kaluza-Klein 
theory, it is that which links the internal and external geometries. 

We want to preserve (61) stating that $e^\alpha{}_i = 0$ under the desired coordinate transformation. As before, we use 
the notation $X^M = (x^\mu, y^j)$, that is, with the internal coordinates denoted by $y$. Recall from chapter IX.7 how the 
vielbein transforms: 

$e'^A{}_M(X') = \frac{\partial X^N}{\partial X'^M}\,e^A{}_N(X)$ (69) 

Hence we require $0 = e'^\alpha{}_i(X') = \frac{\partial X^N}{\partial y'^i}\,e^\alpha{}_N(X) = \frac{\partial x^\nu}{\partial y'^i}\,e^\alpha{}_\nu(X)$. From (59), we have $e_A{}^\mu e^A{}_\nu = e_\alpha{}^\mu e^\alpha{}_\nu = \delta^\mu_\nu$. Thus, multiplying 
by $e_\alpha{}^\mu$, we see that the preceding requirement implies that we should restrict ourselves to those coordinate 
transformations in which $x^\mu$ does not depend on $y'^i$. 

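See if you can work out the condition for decoupling before reading further. 

From $G'_{MN}(X') = \frac{\partial X^P}{\partial X'^M}\frac{\partial X^Q}{\partial X'^N}\,G_{PQ}(X)$, we have 

$G'_{i\nu}(x', y') = \frac{\partial X^P}{\partial y'^i}\frac{\partial X^Q}{\partial x'^\nu}\,G_{PQ} = \frac{\partial y^k}{\partial y'^i}\left(\frac{\partial x^\rho}{\partial x'^\nu}\,G_{k\rho} + \frac{\partial y^l}{\partial x'^\nu}\,G_{kl}\right)$ (70) 

since $x$ does not depend on $y'$. 
In what follows, we must keep an eagle eye out for what variables are held fixed in the various partial derivatives 
(as when doing thermodynamics). Clearly, the quantity $\frac{\partial y^l}{\partial x'^\nu}$ in (70) is evaluated with $y'$ held fixed. Instead of 
thinking of it as a function of $x'$ and $y'$, it is more convenient for our purposes to think of it as a function of $x$ 
and $y'$: $y^l = y^l(x^\rho, y'^k)$. So in (70), write $\frac{\partial y^l}{\partial x'^\nu}$ as $\frac{\partial x^\rho}{\partial x'^\nu}\frac{\partial y^l}{\partial x^\rho}$. Then we obtain 

$G'_{i\nu}(x', y') = \frac{\partial y^k}{\partial y'^i}\frac{\partial x^\rho}{\partial x'^\nu}\left(G_{k\rho} + \frac{\partial y^l}{\partial x^\rho}\,G_{kl}\right)$ (71) 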
ylk 


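Now we impose the stated form (68) and demand that $G'_{i\nu}(x', y') = 0$. Assuming that the matrices $\frac{\partial y^k}{\partial y'^i}$ and 
$\frac{\partial x^\rho}{\partial x'^\nu}$ are nonsingular, we peel them off in (71) to obtain $G_{k\nu} + \frac{\partial y^l}{\partial x^\nu}\,G_{kl} = 0$. Using this, and recognizing that 
$G_{k\nu} = N_{k\nu}$ and $G_{kl} = g_{kl}$, we obtain 

$-\frac{\partial y^l}{\partial x^\nu}\Big|_{y'} = N_{\nu k}\,g^{kl} \equiv N^l{}_\nu$ (72) 

(where, as before, $g^{kl}$ is the inverse of $g_{kl}$). Differentiating (72), we have 

$-\frac{\partial^2 y^l}{\partial x^\mu\partial x^\nu}\Big|_{y'} = \frac{\partial N^l{}_\nu}{\partial x^\mu}\Big|_{y'} = \frac{\partial N^l{}_\nu}{\partial x^\mu}\Big|_{y} + \frac{\partial N^l{}_\nu}{\partial y^j}\frac{\partial y^j}{\partial x^\mu}\Big|_{y'} = \frac{\partial N^l{}_\nu}{\partial x^\mu} - N^j{}_\mu\frac{\partial N^l{}_\nu}{\partial y^j}$ (73) 

In this final form, we regard $N^l{}_\nu$ as a function of $x^\mu$ and $y^j$, as indeed it is. Since the left hand side of (73) is 
symmetric in $\mu$ and $\nu$, the part of the right hand side antisymmetric in $\mu$ and $\nu$ must vanish. So let’s give this expression a name: 

$F^l{}_{\mu\nu} \equiv \frac{\partial N^l{}_\nu}{\partial x^\mu} - N^j{}_\mu\frac{\partial N^l{}_\nu}{\partial y^j} - (\mu \leftrightarrow \nu)$ (74) 

The condition for the geometries to decouple is then simply $F^l{}_{\mu\nu} = 0$. 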

As I explained, we should identify the expression for $F^l{}_{\mu\nu}$ in (74) as the analog of the electromagnetic field 
strength. The reader who has studied nonabelian gauge theory will recognize that this almost looks like the Yang-Mills 
field strength. It is certainly pleasing to see something like that emerging from considerations of geometries 
decoupling. Nevertheless, it is clear that we should not yet identify $F^l{}_{\mu\nu}$ as the desired field strength. After all, 
$F^l{}_{\mu\nu}$ carries only geometrical indices. We have to connect geometry to algebra. 

As explained in the text, the Killing vectors do precisely that. So write¹⁹ 

$N^i{}_\mu = A^a_\mu\,\xi^i_a$ (75) 

and then (74) becomes 

$F^l{}_{\mu\nu} = \frac{\partial A^a_\nu}{\partial x^\mu}\,\xi^l_a - A^b_\mu\,\xi^j_b\,A^a_\nu\,\frac{\partial\xi^l_a}{\partial y^j} - (\mu \leftrightarrow \nu)$ (76) 

Using Lie’s equation (49), we obtain 

$F^l{}_{\mu\nu} = (\partial_\mu A^c_\nu - \partial_\nu A^c_\mu - \epsilon_{abc}\,A^a_\mu A^b_\nu)\,\xi^l_c \equiv F^c_{\mu\nu}\,\xi^l_c$ (77) 

The Yang-Mills field strength $F^c_{\mu\nu}$ emerges naturally in the Kaluza-Klein framework. 
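
As a consistency check, the construction should collapse to the familiar electromagnetic statement when the internal space is a circle. The sketch below assumes a single internal coordinate $y$ with $g_{55} = 1$, a single Killing vector $\xi = 1$, and $l = 1$, so that $N_\mu = A_\mu(x)$ is independent of $y$.

```latex
\begin{align*}
N_\mu(x) = A_\mu(x),\quad \partial_y N_\mu = 0
 \quad&\Longrightarrow\quad
 F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu , \\
F_{\mu\nu} = 0 \ \text{(locally)}
 \quad&\Longrightarrow\quad
 A_\mu = \partial_\mu\Lambda, \qquad y' = y + \Lambda(x), \\
dy + A_\mu\,dx^\mu = dy'
 \quad&\Longrightarrow\quad
 ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu + dy'^2 .
\end{align*}
```

The change of coordinate $y' = y + \Lambda(x)$ is exactly the solution of the decoupling condition $-\partial y/\partial x^\nu|_{y'} = N_\nu$ in this case, and the metric becomes block diagonal, as it should.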




Figure 3 The dynamics of geometry: an instant in time is represented by 
a spacelike 3-dimensional hypersurface in curved spacetime. 


Appendix 9: The dynamics of geometry and the ADM formulation 


This may be a good place to mention an important subject, namely the Arnowitt-Deser-Misner (known as ADM) 
formulation of gravitational dynamics, even though it is not directly related to Kaluza-Klein theory. In physics, 
you are used to specifying a dynamical system at some initial time and then asking how it evolves in time. 
In general relativity, time $t$ is one of the coordinates, and an instant in time is represented by a spacelike 
3-dimensional hypersurface in spacetime, in general curved. Thus, we are to specify the 3-dimensional geometry 
on an initial spacelike hypersurface specified by t equal to some constant and then to follow the dynamics of the 
geometry—what Wheeler called geometrodynamics*—as we move from one hypersurface to another. 

Well, you might have noticed that the sheets in figure 2 can represent these hypersurfaces. We flip a switch in 
our brains and rename (see figure 3) the “internal coordinates” $y^i$ as the spatial coordinates $x^i$ and the spacetime 
coordinates $x^\mu$ as the single time coordinate $t$. We can immediately take over the metric in (66) and write 

$g_{\mu\nu} = \begin{pmatrix} -N^2 + N_i\,g^{ij}N_j & N_j \\ N_i & g_{ij} \end{pmatrix}$ (78) 

where $N_i = g_{0i}$ and $-N^2 = g_{00} - N_i\,g^{ij}N_j$, with $N$ known as lapse and $N_i$ as shift. Pictorially, the shift measures how one 
hypersurface is not lined up with the next. (Now the phrase “pushing forward the many fingers of Time” probably 
makes sense to you.) We now plug $g_{\mu\nu}$ into the Einstein-Hilbert action, identify the conjugate momentum 
variables, and then pass from the action to a Hamiltonian formulation† of Einstein gravity. 
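
The role of the lapse is perhaps clearest in the inverse metric. The following Python sketch treats the simplest case of one time and one space dimension, with hypothetical function names, and assumes the reading of (78) in which $-N^2$ takes the place of the external metric in (66). It checks with exact fractions that $g^{00} = -1/N^2$ regardless of the shift.

```python
from fractions import Fraction as Fr


def adm_metric_1p1(lapse, shift, g11):
    """2x2 metric of the form (78) with a single spatial dimension:
    g_00 = -N^2 + N_1 g^{11} N_1, g_01 = N_1, g_11 = g11."""
    return [[-lapse ** 2 + shift * shift / g11, shift], [shift, g11]]


def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def lapse_squared_from_metric(metric):
    """Recover N^2 from the time-time component of the inverse metric."""
    return -1 / inverse_2x2(metric)[0][0]
```

Because the determinant works out to $-N^2g_{11}$, the shift drops out of $g^{00}$ entirely. This is the two-dimensional version of the statement that the upper left block of (67) is simply the inverse of the external metric.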

This initial value formulation is important in the burgeoning field of numerical relativity. Another reason for 
its importance is that once we have the Hamiltonian, we can apply Heisenberg’s formalism to quantize gravity. 
The ADM Hamiltonian also gives us one way of defining energy. Thus, the positive energy theorem states that the 
ADM energy²⁰ of an asymptotically flat (nonsingular) spacetime satisfying the field equation with a $T_{\mu\nu}$ obeying 
the dominant energy condition is not negative (recall chapter IX.3). This all important but vast subject of the 
ADM formulation lies way beyond the scope of an introductory text; I refer²¹ the reader to more specialized 
monographs. 


Appendix 10: Letters from Einstein to Katuza 1919-1925 


Theodor Katuza’s son made public”? the letters his father received from Einstein. I find it quite interesting to 
see how Einstein’s thinking evolved, particularly in comparing Katuza’s idea with Weyl’s idea. I give here a few 
excerpts with the corresponding dates. 

April 21, 1919: “The idea [of unifying electromagnetism with gravity] has also frequently and persistently 
haunted me. The idea, however, that this can be achieved through a five dimensional cylinder-world has never 
occurred to me and would seem to be altogether new. I like your idea at first sight very much. From a physical 
point of view it appears to me more promising than the mathematically so penetrating Ansatz of Weyl, because 


* Which, in my humble opinion, is a better name than Einstein gravity, and certainly far superior to the 
historical general relativity. 

† Conceptually, this procedure is the same as, but far more sophisticated than, the passage from the Lagrangian
$L(q, \dot q)$ to the Hamiltonian $H(p, q)$ you learned in classical mechanics.

it concerns itself with the electric field and not with the, in my opinion, physically meaningless four-potential.”
(The underlining is Einstein’s.) Einstein turned out to be wrong in that last sentence! He offered to present
Kaluza’s paper to the Berlin academy if it could be shortened to less than 8 printed pages (the limit imposed on
nonmembers).

April 28, 1919: Einstein started with “I have read through your paper and find it really interesting” but went
on to state “the arguments . . . do not appear convincing enough.” He then suggested a further calculation to
clear up a “question of the geodesic lines,” saying that “You must not be offended by this because if I present
your work [to the academy] I am backing it up with my name.” He also mentioned that he knew the editor of the
“newly founded Mathematische Zeitschrift” quite well and could get Kaluza’s paper published there instead.

May 29, 1919: “It is true that I made a blunder with [some remark Einstein made in a previous letter]. . . . I 
see that you thought the matter over quite carefully. I have great respect for the beauty and boldness of your idea. 
But you will understand that I cannot take side with it as originally planned given the present factual doubts.” 
Again, he offered to “put in a word” for Kaluza with the editor of the Mathematische Zeitschrift.

October 14, 1921: Notice that this letter was written more than 2 years after the first one. Interestingly, this 
letter carries the salutation “Most revered Dr. Kaluza” instead of the “Dear colleague” used in the previous letters.
The letter was direct and to the point. “I am having second thoughts about having restrained you from publishing 
your idea on a unification of gravitation and electricity two years ago. Your approach seems in any case to have 
more to it than the one by H. Weyl. If you wish I shall present your paper to the academy after all, provided you 
sent it to me.” 

February 27, 1925: More than 3 years later, Einstein wrote: “I am still of the opinion that your idea . . . is 
of great originality and merits the serious interest of academic colleagues. . . . I myself have so far struggled
with this problem in vain. It often appears to me that the magnetic field of the earth is based upon an as yet 
unknown connection between gravitation and electromagnetism, but I cannot come out of the inconsistencies.” 
To a present-day theoretical physicist such as myself, that last sentence sounds rather astonishing, to say the least. 


Notes 


1. Perhaps Einstein’s reluctance was explained in a 1922 letter he wrote to Hermann Weyl, saying that “Kaluza
seems to me to have come closest to reality, even though he too fails to provide the singularity free electron”
(quoted in J. van Dongen, Einstein’s Unification, Cambridge University Press, 2010, p. 134). Einstein’s dream
of seeing the electron emerge as a solution of his field equation has not been realized and seems more remote 
than ever. The lesson here is not to demand too much of a promising theory. 

2. The scalar field $\varphi$ introduced in chapter II.3 and more recently in chapter IX.5 satisfies the Klein-Gordon
equation. Klein also anticipated something like the Yang-Mills field strength (which we will discuss briefly 
in appendix 5) in 1938. O. Klein, New Theories in Physics, International Institute of Intellectual Cooperation, 
League of Nations, 1938, pp. 77-93. 

3. Independently, the Russian physicist H. Mandel did the same in 1926. For this and some other historical 
tidbits, see the introduction in T. Appelquist, A. Chodos, and P. G. O. Freund, Modern Kaluza-Klein Theory, 
Addison-Wesley, 1987. 

4. Klein later said that it was Pauli who told him that Kaluza had anticipated his work.

5. The physics here is essentially the same as that in a wave guide. 

6. P. Jordan was the first to introduce this scalar field, but before he could return the proofs of his paper, the 
building housing the journal, Physikalische Zeitschrift, was bombed. Fortunately, a copy of the proofs was sent 
to Pauli, who showed them to Einstein and Bergmann. 

7. See QFT Nut, chapter VII.5-7. 

8. See, for example, QFT Nut, chapter IV.5.

9. As far as I know, this was first published in 1968 by R. Kerner from the University of Warsaw (Ann. Inst. H. 
Poincaré 9 (1968), p. 143). A complete and general derivation was first given in 1975 by Y. M. Cho and P. G. O. 
Freund, Phys. Rev. D 12 (1975), p. 1711. Earlier, it was given as homework problem number 77 in B. DeWitt’s 
1963 Les Houches lectures “Dynamical Theory of Groups and Fields.” A personal note: The volume of the 
proceedings of the 1963 Les Houches school (Relativity, Groups, and Topology, ed. C. DeWitt and B. DeWitt, 
Gordon and Breach, 1964) is in fact one of the first physics books I owned as a sophomore in college. (J. A. 
Wheeler probably gave me a copy; I couldn't possibly have had the wits or the means to buy this huge book 
with almost 1,000 pages.) I remember poring over Wheeler’s lectures “Geometrodynamics and the Issue of 
the Final State,” trying to make sense of the whole thing. 

10. One reason that Klein did not obtain the Yang-Mills structure earlier than he did was that he considered a 5,
rather than a 6, dimensional theory.

11. G. Nordström, Physikalische Zeitschrift 15 (1914), pp. 504-506.

12. As you have probably also heard, string theory can only be formulated in 10 or 11 dimensions, thus sparking
a tidal wave of contemporary interest in higher dimensions.

13. See, for example, the reprint volume by T. Appelquist et al., Modern Kaluza-Klein Theory, cited in note 3.

14. E. Witten has rendered this conclusion mathematically precise with index theorems.

15. This is one of the reasons string theorists abandoned the spheres for the mathematically more sophisticated
but physically less friendly Calabi-Yau manifolds.

16. See the first footnote in this chapter.

17. Adapted from A. Zee, “Grand Unification and Gravity,” in Grand Unified Theories and Related Topics,
ed. M. Konuma and T. Maskawa, World Scientific, 1981, p. 143.

18. For example, C. W. Misner et al., Gravitation, p. 506.

19. It is important to note that, as was shown clearly in the step-by-step calculation, it is not $N_{\mu i}$ but $N_\mu^{\ i}$ that
appears here. Recall especially (72). Some authors even mistakenly write $G_\mu^{\ i}$, which is of course identically 0.
The original Kaluza 5 = 4 + 1 example emphasizes this point: $N_{\mu 5} = G_{\mu 5} = A_\mu \varphi^2$ but $N_\mu^{\ 5} = G_{\mu 5}\, g^{55} = A_\mu$,
since $G_{55} = \varphi^2 = g_{55}$ and hence $g^{55} = 1/\varphi^2$.

20. In particular, this defines the ADM mass of an object. For a discussion of the different definitions of mass
in general relativity, see N. Ó Murchadha et al., arXiv:0912.4001.

21. See particularly Deserfest: A Celebration of the life and works of Stanley Deser, ed. J. T. Liu, M. J. Duff, and K. S.
Stelle.

22. Facsimiles and translations of the letters may be found in Unified Field Theories of More Than 4 Dimensions,
ed. V. De Sabbata and E. Schmutzer, World Scientific, 1983.


X.2 Brane Worlds and Large Extra Dimensions 


Escape and visibility 


The escape and visibility problems confront all those who want to have more than (3 + 1) 
dimensions. In Kaluza-Klein theory, they are solved by supposing that the size of the higher
dimensions is characterized by the Planck length. How else can you solve these problems? 

Another possibility is that all the particles we know and love are somehow nailed down 
to the (3 + 1)-dimensional spacetime we call home, which amounts merely to a slice¹ in
a higher dimensional spacetime. The graviton, in contrast, roams over all of spacetime, 
since it has to do with fluctuations of the entire metric, not just the metric on the particular 
slice humans live on. 

You might think that somebody would have raised this possibility at some point after, 
say, 1920, but as far as I know, nobody did until recent times, at least not in print. 
This solution to the escape and visibility problems probably would have struck most 
theoretical physicists, until now, as rather contrived and ad hoc, with all but one of the 
particles forbidden to roam over all of spacetime. But, within string theory, this scenario 
occurs rather naturally, as was pointed out by Polchinski. In one realization, quarks, 
leptons, gauge bosons (such as the photon and the gluons), and so on are all described by 
open strings, while the graviton is described by a closed string. The (3 + 1)-dimensional
spacetime we live in is known² as a 3-brane, to which the ends of the open string must be
attached. A closed string, in contrast, has no ends to attach to anything. Consequently, the 
graviton is free to roam. Needless to say, this sketch hardly does justice to the glory and 
splendor of string theory. 

This scenario is known variously as large extra dimensions³ or brane worlds. Let me say
right off that the term “large extra dimensions” merely means that the extra dimensions 
are large compared to the Planck length. The extra dimensions are still tiny on the scale of 
everyday life (see below). So, no, you could not wander off into the extra dimensions as in 
some science fiction story. I might also mention that you do not need to know string theory 
to read this chapter; indeed, one could readily make up purely field theoretic mechanisms 
for confining all particles except for the graviton to the (3 + 1)-dimensional slice of the
larger spacetime. The brane world scenario is inspired by string theory but does not depend 
on it. 

We will start with some general considerations of gravity in a higher dimensional world 
rather than a specific brane world model. 


Newton’s inverse square law 


Way back in chapter II.1, we discussed the answer to the question physics students often 
ask (or fail to ask): why an inverse square law, and not inverse cube, say? The deep answer 
is that the power 2 follows from rotational invariance and the dimensions of space. 

The physical origin of the inverse square law goes back to Faraday and his flux picture. 
He was talking about electric flux, but it could just as well be gravitational flux. Consider 
a sphere of radius r surrounding a charge. There is a fixed amount of electric flux coming 
out of the charge. Since the area of the sphere is given by $4\pi r^2$, the flux going through the
sphere per unit area varies like $(4\pi r^2)^{-1} \propto 1/r^2$. That’s it, the inverse square law! We see
that it comes from a geometrical fact about area and clearly depends on the dimensions 
of space. 

More formally, recall the discussion given in chapter II.1. Newton’s gravitational potential
$V$ around a point mass $M$ satisfies

$$\nabla^2 V(x, y, z) = 4\pi G M\, \delta^{(3)}(x, y, z) \tag{1}$$

The poor man solves this dimensionally (as we did in chapter II.1) by setting $\nabla^2 \sim 1/r^2$
and $\delta^{(3)}(x, y, z) \sim 1/r^3$, so that the equation becomes $V/r^2 \propto 1/r^3$, thus giving $V \sim 1/r$
and hence the inverse square force. A richer man, but not necessarily a rich man, would
solve (1) by Fourier analysis, using the integral representation⁴ of the delta function
$\delta^{(3)}(x, y, z) = \int \frac{d^3k}{(2\pi)^3}\, e^{i\vec k\cdot\vec x}$ to obtain

$$V(\vec x) \propto \int d^3k\, e^{i\vec k\cdot\vec x}\,\frac{1}{k^2} \propto \frac{1}{r} \tag{2}$$

By dimensional analysis, the integral has dimension of $k^3/k^2 \sim k \sim 1/r$. The inverse
square law then follows.

Note, once again, rotational invariance and the 3 dimensions of space, as reflected in 
the factor $d^3k$ in (2). This is of course just a more sophisticated formulation of Faraday’s
flux picture. 

A bit of digression for the benefit of some readers. I might mention that in quantum field 
theory, the potential $V$ is given by⁵ the Fourier transform of the Feynman amplitude for
the exchange of a graviton between 2 external masses. Again, rotational invariance and the
3 dimensions of space imply the inverse square law. The attractive,⁶ rather than repulsive,
character of gravity is due to the 2 units of spin carried by the graviton. Whether a force is
repulsive or attractive can be traced back, in some sense, to the difference between space
and time.⁷


Brane world


Given this discussion, you might worry that the brane world is untenable. Won’t the 
inverse square law get unacceptably modified? As we just saw, the inverse square depends 
on the dimensions of space. Indeed, the reader might remember working this out in 
exercise II.1.2.

So suppose there are $n$ extra dimensions, with coordinates $x^4, x^5, \ldots, x^{3+n}$. Since (1)
follows from rotational invariance, it continues to hold, except that the delta function $\delta^{(3)}$
on the right hand side is to be replaced by $\delta^{(3+n)}$, appropriate for the $(3 + n)$-dimensional
space we are now in.

No sweat, the poor man says, I can solve this equation in an instant too: now $\delta^{(3+n)} \sim
1/r^{3+n}$, so that $V/r^2 \propto 1/r^{3+n}$, thus giving $V \sim 1/r^{n+1}$. The richer man, who has learned
Fourier analysis, also obtains this result with scarcely more labor. He writes

$$V(\vec x) \propto \int d^{3+n}k\, e^{i\vec k\cdot\vec x}\,\frac{1}{k^2} \propto \frac{1}{r^{1+n}} \tag{3}$$

and obtains a force decreasing like $1/r^{n+2}$. This sure is not your grandparents’ gravitational
force law!
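The poor man's power counting is mechanical enough to be written as a tiny function. The sketch below (function names are illustrative, not from the text) simply encodes the dimensional argument: the delta function scales like $1/r^{3+n}$, the Laplacian like $1/r^2$, and the force carries one more power of $1/r$ than the potential.

```python
def potential_exponent(n_extra):
    """Power p in V ~ 1/r**p for 3 + n_extra spatial dimensions."""
    # delta function ~ 1/r**(3 + n); Laplacian ~ 1/r**2, so V ~ r**2 / r**(3 + n)
    return (3 + n_extra) - 2

def force_exponent(n_extra):
    """Power q in F ~ 1/r**q; differentiating V costs one more power of 1/r."""
    return potential_exponent(n_extra) + 1

for n in range(4):
    print(n, potential_exponent(n), force_exponent(n))
```

With no extra dimensions this returns the familiar $1/r$ potential and inverse square force; each extra dimension steepens both by one power.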

Doesn’t (3) contradict observation immediately? Well, no, because Newton’s law continues
to hold for $r \gg R$, where $R$ denotes the characteristic length scales associated with
the extra coordinates. In this regime, the extra dimensions are so small, compared with
the length scale $r$ of the phenomenon we are interested in, that the gravitational flux cannot
spread far in the direction of the $n$ extra coordinates. Think of the flux being forced to
spread in only the 3 spatial directions we know, just like the electromagnetic field in a wave
guide is forced to propagate down the tube. Effectively, we are back in $(3 + 1)$-dimensional
spacetime, and $V(r)$ reverts back to a $1/r$ dependence. Another way of seeing this is that
in the limit $R \to 0$, effectively there aren’t any extra dimensions.

You can put it somewhat paradoxically by saying that, in all actual situations involving 
gravity, not only is the large extra dimension not large, but it is effectively zero. 

The new law of gravity (3) holds only in the opposite regime $r \ll R$, when the two masses
are very close to each other. Heuristically, when $R$ is much larger than the separation
between the two particles, the flux does not know that the extra coordinates are finite
in extent and thinks that it lives in a $(3 + n + 1)$-dimensional universe. Thus, we should
look for deviation from the inverse square law at short distances. At present, the force law
has only been checked⁸ down to $r \sim 1$ millimeter or so, which is huge compared to the
Planck length. Because of the weakness of gravity, Newton’s force law has not been tested
to much accuracy at laboratory distance scales, and so there is plenty of room for theorists
to speculate.


The true scale of gravity 


I already mentioned, way back in the introduction to this book, that the immensity of the
Planck mass $M_P$, numerically $\sim 10^{19}$ times larger than the proton mass $M_p$, is responsible
for the Mother of All Headaches plaguing fundamental physics today. Why is the intrinsic
mass scale of gravity so large compared to anything else* we know?

An attractive feature of the large extra dimension scenario is that the true mass scale of
gravity $M_{TG}$ may be lowered considerably below $M_P$ and thus alleviate this so-called hierarchy
problem.

The preceding discussion and (3) tell us that the gravitational potential between two
objects of masses $m_1$ and $m_2$ separated by a distance $r \ll R$ is given by

$$V(r) = -\frac{m_1 m_2}{(M_{TG})^{2+n}}\,\frac{1}{r^{1+n}} \quad \text{for } r \ll R \tag{4}$$

Note that the dependence on $M_{TG}$ follows from dimensional analysis: two powers to cancel
$m_1 m_2$ and $n$ powers to match the $n$ extra powers of $1/r$.

In contrast, for $r \gg R$, as we have argued, the geometric spread of the gravitational flux
is cut off by $R$, and the potential reverts to the familiar $1/r$ dependence. Thus motivated,
we replace $n$ powers of $r$ in (4) by $n$ powers of $R$ to obtain†

$$V(r) = -\frac{m_1 m_2}{(M_{TG})^{2+n} R^n}\,\frac{1}{r} \quad \text{for } r \gg R \tag{5}$$

Comparing this with the observed law $V(r) = -\frac{m_1 m_2}{M_P^2}\frac{1}{r}$, we manage to determine the
true scale of gravity: $M_P^2 = (M_{TG})^{2+n} R^n$, that is, $(M_{TG})^{2+n} = M_P^2/R^n = M_P^{2+n}\,(l_P/R)^n$. In the last step, we
used $l_P = 1/M_P$. In other words,

$$M_{TG} = (l_P/R)^{\frac{n}{n+2}}\, M_P \tag{6}$$

so that, if $R/l_P$ could be made large enough, we would have the intriguing possibility that
the fundamental scale of gravity $M_{TG}$ might be much lower than what our grandparents
thought. Equivalently, the size of the large extra dimensions is given in terms of the Planck
length by

$$R = l_P \left(\frac{M_P}{M_{TG}}\right)^{\frac{n+2}{n}} \tag{7}$$

Suppose the true scale of gravity is as low as $M_{TG} \sim 10$ TeV $= 10^4$ GeV, this being
the energy regime that the Large Hadron Collider can explore;‡ then $R \sim (10^{15})^{2/n}\, 10^{-17}$
centimeter. We see that the $n = 1$ case is already ruled out, but $n \geq 2$ is still allowed.
Evidently, as $n \to \infty$, $R^{-1} \to M_{TG}$. Thus, $R$ is bounded on one side by our desire to lower
the fundamental scale of gravity and on the other by experiments.
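To get a feel for the numbers, here is a rough sketch that evaluates the size of the extra dimensions, $R = l_P (M_P/M_{TG})^{(n+2)/n}$, for a few values of $n$. The rounded inputs ($l_P \approx 1.6 \times 10^{-33}$ cm, $M_P \approx 1.2 \times 10^{19}$ GeV) and the function name are assumptions chosen for illustration; since factors of $O(1)$ are being ignored throughout, only the order of magnitude means anything.

```python
PLANCK_LENGTH_CM = 1.6e-33   # l_P, rounded (assumed input)
PLANCK_MASS_GEV = 1.2e19     # M_P, rounded (assumed input)

def extra_dimension_size_cm(m_tg_gev, n):
    """R = l_P * (M_P / M_TG)**((n + 2) / n), with O(1) factors dropped."""
    return PLANCK_LENGTH_CM * (PLANCK_MASS_GEV / m_tg_gev) ** ((n + 2) / n)

for n in (1, 2, 3, 6):
    print(n, extra_dimension_size_cm(1.0e4, n))   # M_TG = 10 TeV
```

For $n = 1$ the result is of order $10^{12}$ to $10^{13}$ cm, an astronomical distance over which Newton's law is well tested, while for $n = 2$ it drops to a small fraction of a millimeter, in line with the estimate in the text.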

You might have noticed that $R$ large implies the appearance of a small mass $\mu \equiv
(M_{TG}/M_P)^{\frac{n+2}{n}} M_P = R^{-1}$. In particular, for $M_{TG} \sim 10$ TeV and $n = 2$, we have

$$\mu = (10^{-15})^2\, 10^{19}\ \text{GeV} = 10^{-2}\ \text{eV}$$


* In particular, the electroweak scale at $\sim 10^3 M_p$.
† We are not interested in numerical factors of $O(1)$ here.
‡ Outside the theoretical physics community, this is known as wishful thinking.

much smaller* than the typical scale of particle physics. In this sense, this scenario is not
an entirely satisfactory solution of the hierarchy problem.
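The arithmetic behind the small mass scale $R^{-1}$ can be reproduced in a couple of lines. This is a sketch using the same round numbers as the text ($M_{TG} = 10^4$ GeV, $M_P = 10^{19}$ GeV); the function name is made up for the occasion.

```python
def small_mass_ev(m_tg_gev, m_planck_gev, n):
    """mu = (M_TG / M_P)**((n + 2) / n) * M_P, converted from GeV to eV."""
    mu_gev = (m_tg_gev / m_planck_gev) ** ((n + 2) / n) * m_planck_gev
    return mu_gev * 1.0e9

mu = small_mass_ev(1.0e4, 1.0e19, 2)
print(mu)   # in eV, for n = 2
```

Raising $n$ pushes this mass up toward $M_{TG}$ itself, so the new small scale is most pronounced for the smallest allowed $n$.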

The beginning student might wish to skip the rest of this chapter. Unlike the material 
in the bulk of this book, the ultimate value of any specific brane world model to physics is 
far from certain. 


A 2-brane model 


As an interesting alternative to the picture outlined in the introduction, Randall and
Sundrum proposed a 2-brane model.⁹ Consider a 5-dimensional spacetime with the fifth
coordinate $y = x^5$ restricted to $\pi R \geq y \geq -\pi R$, with the points $y$ and $-y$ identified. A
circle with pairs of points thus identified, namely $S^1/Z_2$, is known as an orbifold.¹⁰ The
action is taken to be

$$S = \int d^4x\, dy\, \sqrt{-G}\left(\tfrac12 M_5^3 R(G) + \Lambda_5\right) + S_{\rm branes} \tag{8}$$

As in chapter X.1, we denote the scalar curvature constructed out of the $(4 + 1)$-dimensional
metric $G_{MN}$ by† $R(G)$ to distinguish it from the scalar curvature $R(g)$ constructed
out of the $(3 + 1)$-dimensional metric $g_{\mu\nu}$. (We denote the 5-dimensional coordinates by
$x^M = (x^\mu, y)$, with $M = 0, 1, 2, 3, 5$ and $\mu = 0, 1, 2, 3$.) Here the mass $M_5$ is the analog of
the Planck mass $M_P$ for 5-dimensional spacetime. (As in chapter X.1, dimensional analysis
dictates that this mass, which sets the scale of 5-dimensional gravity, be cubed, since the
scalar curvature $R$ (in any spacetime dimension) contains two derivatives acting on the
metric and hence has mass dimension 2.) Also, we denote the determinant of $g_{\mu\nu}$ and
$G_{MN}$ by $g$ and $G$, respectively. Note that $\Lambda_5$, the 5-dimensional analog of the cosmological
constant, has dimensions of mass to the fifth power.

Our brane, namely the brane we live on, is placed at one end of this spacetime at $y = \pi R$,
and another brane, known as the Planck brane, is located at the other end at $y = 0$. In other
words, we write

$$S_{\rm branes} = \int d^4x\, dy\, \sqrt{-G}\,\Big(\delta(y)\,(\Lambda_P + \mathcal{L}_P) + \delta(y - \pi R)\,(\Lambda_O + \mathcal{L}_O)\Big) \tag{9}$$

Here $\Lambda_P$, $\mathcal{L}_P$, $\Lambda_O$, and $\mathcal{L}_O$ denote the cosmological constant and the Lagrangian of the
matter fields on the Planck brane and on our brane, respectively. In what follows, we
mostly set all the matter fields on the two branes to 0, that is, we simply ignore $\mathcal{L}_P$ and $\mathcal{L}_O$.

The 5-dimensional Einstein field equations are obtained, as usual, by varying $S$. Following
Randall and Sundrum, let us look for a solution of the form

$$ds^2 = e^{-2w(y)}\,\eta_{\mu\nu}\,dx^\mu dx^\nu + dy^2 \tag{10}$$

* Notice in passing that this happens to be roughly of order of the cosmological constant mass scale and also
the “typical” neutrino mass. Is there something here?
† Not to be confused with the radius $R$ of the orbifold!

with a function $w(y)$ known picturesquely as the warp function. In other words, $G_{\mu\nu} =
e^{-2w(y)}\eta_{\mu\nu}$, $G_{\mu 5} = 0$, $G_{55} = 1$. In contrast to the simple picture presented earlier, $G_{\mu\nu}$
depends on $y$.

A straightforward computation shows that the only nonvanishing Christoffel symbols
are $\Gamma^5_{\mu\nu} = \eta_{\mu\nu}\, e^{-2w} w'$ and $\Gamma^\mu_{\nu 5} = -w'\,\delta^\mu_\nu$, with $w' \equiv dw/dy$. Since the Riemann curvature
tensor involves the sum of $\partial\Gamma$ and $\Gamma\Gamma$, we are not surprised that the nonvanishing
components of the Ricci tensor work out to be

$$R_{\mu\nu} = -(4w'^2 - w'')\, e^{-2w}\eta_{\mu\nu}, \qquad R_{55} = -4\,(w'^2 - w'') \tag{11}$$

Away from the branes, that is, for $y \neq 0$ and $y \neq \pi R$, $S_{\rm branes}$ does not contribute, and the
field equations read simply

$$R_{MN} = G_{MN}\,\Lambda_5/M_5^3 \tag{12}$$

To solve this, we first note that we can rewrite the result in (11) as $R_{\mu\nu} = -(4w'^2 - w'')\,G_{\mu\nu}$
and $R_{55} = -4(w'^2 - w'')\,G_{55}$. For $R_{\mu\nu}$ and $R_{55}$ to be proportional to $G_{\mu\nu}$ and $G_{55}$, respectively,
with the same proportionality constant as required by (12), we see that $w''$ must vanish, so
that $w(y)$ is a linear function of $y$. Furthermore, $4w'^2 = -\Lambda_5/M_5^3$, so that $w(y) = \pm y/L$,
with $L$ a length scale defined by

$$L \equiv \left(-4M_5^3/\Lambda_5\right)^{\frac12} \tag{13}$$

thus indicating that this whole setup works only if $\Lambda_5 < 0$. Imposing the orbifold symmetry
$w(y) = w(-y)$, we obtain $w(y) \propto |y|$, for $y \neq 0$ and $y \neq \pi R$.

Next, the 5-dimensional spacetime has to notice the presence of the branes at $y = 0$
and $y = \pi R$. In other words, we have to solve Einstein’s field equation (12) amended by
$\Lambda_5 \to \Lambda_5 + \delta(y)\Lambda_P + \delta(y - \pi R)\Lambda_O$. Observe that with the solution $w(y) \propto |y|$, the slope
$w'(y)$ flips sign as we cross $y = 0$, so that indeed, $w''(y)$ behaves like a delta function
at $y = 0$ and similarly at $y = \pi R$. (Some readers might be reminded of solving the delta
function potential problem in quantum mechanics.) Integrating the $MN = 55$ equation
in (12)

$$4M_5^3\,(w'^2 - w'') = -\big(\Lambda_5 + \delta(y)\Lambda_P + \delta(y - \pi R)\Lambda_O\big) \tag{14}$$

across $y = 0$ from $y = 0^-$ to $y = 0^+$, we obtain

$$\int_{0^-}^{0^+} dy\, w''(y) = w'\Big|_{0^-}^{0^+} = \Lambda_P/(4M_5^3)$$

Similarly, integrating across $y = \pi R$ from $y = \pi R^-$ to $y = \pi R^+$, we obtain

$$\int_{\pi R^-}^{\pi R^+} dy\, w''(y) = w'\Big|_{\pi R^-}^{\pi R^+} = \Lambda_O/(4M_5^3)$$

Recalling the condition $w(y) = w(-y)$ and sketching the function $w(y)$, you see that
the jumps $w'\big|_{0^-}^{0^+}$ and $w'\big|_{\pi R^-}^{\pi R^+}$, and hence the cosmological constants $\Lambda_P$ and $\Lambda_O$ on the
two branes, must have opposite signs. There is also a sign choice at this point; for the
considerations below to work, we choose $w(y) = +|y|/L$, and thus obtain¹¹

$$\Lambda_P = -\Lambda_O = 8M_5^3/L = 4\left(-M_5^3\Lambda_5\right)^{\frac12} \tag{15}$$

In this brane world scenario, a fine tuning between the parameters $\Lambda_P$, $\Lambda_O$, and $M_5$ is
required.
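The algebra linking the bulk cosmological constant, the length $L$, and the two brane cosmological constants is easy to verify with numbers. The sketch below picks arbitrary illustrative values ($M_5 = 1$, $\Lambda_5 = -3$ in unspecified units, an assumption made only for the check) and confirms that $w' = 1/L$ satisfies $4w'^2 = -\Lambda_5/M_5^3$ and that the two expressions in (15) agree. Function names are hypothetical.

```python
import math

def warp_length(m5, lambda5):
    """L = sqrt(-4 M_5^3 / Lambda_5); needs a negative bulk cosmological constant."""
    if lambda5 >= 0:
        raise ValueError("the setup works only for Lambda_5 < 0")
    return math.sqrt(-4.0 * m5 ** 3 / lambda5)

def brane_constants(m5, lambda5):
    """(Lambda_P, Lambda_O) = (8 M_5^3 / L, -8 M_5^3 / L), for w(y) = +|y|/L."""
    lam_p = 8.0 * m5 ** 3 / warp_length(m5, lambda5)
    return lam_p, -lam_p

m5, lambda5 = 1.0, -3.0            # arbitrary illustrative values
L = warp_length(m5, lambda5)
lam_p, lam_o = brane_constants(m5, lambda5)

# bulk equation with w' = 1/L: both numbers should agree
print(4.0 / L ** 2, -lambda5 / m5 ** 3)
# the two forms of Lambda_P: both numbers should agree
print(lam_p, 4.0 * math.sqrt(-m5 ** 3 * lambda5))
```

Once $M_5$ and $\Lambda_5$ are given, both brane constants are fixed, which is the fine tuning referred to in the text.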

Some readers may have already noticed that the metric (10) with w(y) linear is just the 
anti de Sitter spacetime studied in chapter IX.11; see in particular (IX.11.18). This also 
explains why the Ansatz (10) leads to a solution of Einstein’s field equation so readily. 

We are now ready to extract some physics from this model. First, allow the metric on
our $(3 + 1)$-dimensional world to fluctuate,* replacing (10) by $ds^2 = e^{-2|y|/L}\, g_{\mu\nu}(x)\,dx^\mu dx^\nu + dy^2$.
How much action do we have to “pay” for this fluctuation?

Simply plug $G_{\mu\nu}(x, y) = e^{-2|y|/L}\, g_{\mu\nu}(x)$, $G_{\mu 5}(x, y) = 0$, and $G_{55}(x, y) = 1$ into (8) and
evaluate the gravitational action

$$S \sim M_5^3 \int d^4x \left(\int_{-\pi R}^{\pi R} dy\, e^{-2|y|/L}\right) \sqrt{-g}\, R(g) \tag{16}$$

We thus identify the Planck mass by

$$M_P^2 = 2M_5^3 \int_0^{\pi R} dy\, e^{-2y/L} = M_5^3 L \left(1 - e^{-2\pi R/L}\right) \tag{17}$$

In the large $R/L$ limit, $M_P^2 \simeq M_5^3 L$ and, once again, the true scale of gravity $M_5$ can be made
to be much lower than the Planck scale. Note that, in contrast to the discussion earlier, due
to the presence of $\Lambda_5$, we have 2 length scales available, $R$ and $L$. In the large $R/L$ limit,
$L$ plays the role of $R$ in (6).
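As a quick sanity check on the $y$ integral, the closed form $M_5^3 L\,(1 - e^{-2\pi R/L})$ can be compared with a direct numerical integration. This is a sketch with arbitrary illustrative values of $M_5$, $L$, and $R$; the midpoint rule and the function names are choices made here, not part of the text.

```python
import math

def planck_mass_sq_closed(m5, L, R):
    """Closed form: M_5^3 L (1 - exp(-2 pi R / L))."""
    return m5 ** 3 * L * (1.0 - math.exp(-2.0 * math.pi * R / L))

def planck_mass_sq_numeric(m5, L, R, steps=100000):
    """Midpoint-rule estimate of 2 M_5^3 * integral_0^{pi R} dy exp(-2 y / L)."""
    h = math.pi * R / steps
    total = sum(math.exp(-2.0 * (k + 0.5) * h / L) for k in range(steps))
    return 2.0 * m5 ** 3 * total * h

print(planck_mass_sq_closed(1.0, 1.0, 2.0), planck_mass_sq_numeric(1.0, 1.0, 2.0))
print(planck_mass_sq_closed(1.0, 1.0, 10.0))   # large R/L: approaches M_5^3 L
```

Already for $R/L$ of a few, the exponential correction is negligible, which is why $R$ drops out of the relation between $M_P$ and $M_5$ in the large $R/L$ limit.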

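Next, consider a scalar field with mass $m$ on our brane, that is, set $\mathcal{L}_O = -\{(\partial\varphi)^2 + m^2\varphi^2\}$
in (9). Since $G_{\mu\nu}(x, y = \pi R) = e^{-2\pi R/L}\, g_{\mu\nu}(x)$, the effective action we see in our world is
given by

$$S \sim \int d^4x \sqrt{-g}\; e^{-4\pi R/L}\left(e^{2\pi R/L}\, g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi + m^2\varphi^2\right) = \int d^4x \sqrt{-g}\left(g^{\mu\nu}\partial_\mu\tilde\varphi\,\partial_\nu\tilde\varphi + m^2 e^{-2\pi R/L}\,\tilde\varphi^2\right) \tag{18}$$

where $\tilde\varphi \equiv e^{-\pi R/L}\varphi$ is normalized to have the correct kinetic energy term. Supposing
the “true mass” $m$ of the scalar field to be of order $M_P$, we can lower the physical mass
$m_{\rm phys} = e^{-\pi R/L}\, m$ to the electroweak scale by choosing $R/L \sim 10$.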
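How large does $R/L$ actually have to be? Inverting the exponential suppression $e^{-\pi R/L}$ gives a one-line estimate. The sketch below assumes a “true mass” of $10^{19}$ GeV and a target between 100 GeV and 1 TeV; these round numbers and the function name are illustrative assumptions.

```python
import math

def required_r_over_l(true_mass, physical_mass):
    """Solve physical_mass = exp(-pi R / L) * true_mass for R / L."""
    return math.log(true_mass / physical_mass) / math.pi

print(required_r_over_l(1.0e19, 1.0e3))   # target ~ 1 TeV
print(required_r_over_l(1.0e19, 1.0e2))   # target ~ 100 GeV
```

Because the suppression is exponential, sixteen or seventeen orders of magnitude in mass cost only a ratio $R/L$ slightly above 10, with no very large or very small input number needed.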


Speculations on brane worlds 


Subsequently, Randall and Sundrum realized that the separation between the 2 branes
could be taken to infinity, thus effectively leaving us with a 1-brane model,¹² which
has become more “fashionable” than their 2-brane model. This work stimulated a vast
literature, amounting to literally thousands of papers. I obviously cannot provide a survey
of the literature here. I merely mention that one interesting application is to cosmology. At
this point, it ought to be clear to you that any specific brane world model should be taken
as merely suggestive. You, the astute reader, probably realize that, with the vast expertise
on classical general relativity you now have, you too could join in the fun. So, a word of
encouragement: The full story of higher dimensional spacetime is yet to be written, and it
could well be written by you!

* Note that we are keeping frozen other degrees of freedom in $G_{MN}$.

Figure 1 You are sitting on the brane minding your own business, and some wave could come in from
the bulk and hit you.


Apparent violation of causality 


The brane world model discussion given here is entirely static. Obviously, it would be 
interesting to introduce time dependence. In the appendix, I discuss one early attempt. 
Here I point out what I regard as a serious difficulty with all such attempts. 

Dynamical discussions of our universe as a brane generically suffer from the awkward 
feature that evolution requires not just initial data on the brane, but also initial data for 
the bulk fields as well. In other words, if you are sitting on the brane minding your own 
business, some wave could come in from the bulk and hit you (figure 1). I like to call this 
the “finger of God” problem. Observers living on the brane would see apparent violation of 
causality, not to mention violation of energy conservation. Various happenings that have 
never been seen! In other words, there are a large number of degrees of freedom living 
in the bulk that we do not have direct access to. To me, this is one of the least attractive 
features* of the brane world. 


* Of course, this has not prevented any number of persons from authoring any number of papers discussing 
the apparent violation of any number of “sacred” concepts. When and if experimentalists see these violations, 
you and I could always revisit the discussion here. 
Appendix: Outgoing brane wave model


As promised, from the vast literature on brane worlds, here I briefly describe the outgoing brane wave model¹³
to give you a flavor of the sort of calculations one can indulge in. Also, the following discussion involves solving 
Einstein’s equations in light cone coordinates, something that might be of methodological interest. Let me first 
set up the model and then describe its physical properties. 

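Consider a $(4 + 1)$-dimensional world (with coordinates $t$, $x^i$, and $y$) containing a $(3 + 1)$-dimensional brane
at $y = 0$ and described by the low energy effective action*

$$S = \int d^5x\, \sqrt{-G}\left[R(G) - \tfrac43 (\partial\varphi)^2\right] + \int d^4x\, \sqrt{-g}\; e^{b\varphi}\,\mathcal{L}_{3+1} \tag{19}$$

The metric $g_{\mu\nu} = \delta^M_\mu \delta^N_\nu\, G_{MN}$ is the $(3 + 1)$-dimensional metric induced on the brane, and $\mathcal{L}_{3+1}$ denotes the
Lagrangian of the $(3 + 1)$-dimensional world. (Here we set $M_5$ to 1.) Note that we do not include a bulk
cosmological constant. In contrast, a dilaton field $\varphi$, as suggested by string theory, is included, with $b$ taken
here as a free phenomenological parameter that ultimately may be determined by string theory.

Einstein’s equations read

$$R_{MN} - \tfrac12 G_{MN} R = \tfrac43\left[\partial_M\varphi\,\partial_N\varphi - \tfrac12 G_{MN}(\partial\varphi)^2\right] + \tfrac12 \sqrt{g/G}\;\, T_{\mu\nu}\,\delta^\mu_M \delta^\nu_N\, \delta(y) \tag{20}$$

(Evidently, $\partial_M\varphi \equiv \partial\varphi/\partial x^M$ and so on.)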
Assume the geometry on the brane to be homogeneous and isotropic, so that the stress energy tensor is 


required to take the form 


TH = —e°? AS" + e¥diag(—p, P, P, P) (21) 


describing a (3 + 1)-dimensional world containing a cosmological constant or vacuum energy A and a perfect 
fluid with energy density p and pressure P, with A, p, and P in general functions of t. 
Upon varying the action with respect to the dilaton field y, we obtain its equation of motion 


$09 =— J] F be L318) (22) 


The original paper on this model includes a perfect fluid. But if I include the perfect fluid, at this point, I would
have to digress and explain to you what $\mathcal{L}_{3+1}$ is for a perfect fluid.¹⁴ Without the perfect fluid, we have, in (22),
simply $\mathcal{L}_{3+1} = -\Lambda$. But since this only affects the dilaton equation of motion, much of the following discussion
goes through even if the perfect fluid is included. Thus, I will proceed to discuss the more general situation, with a
perfect fluid included. For those readers who like to see everything derived rather than simply stated (and I am a
member of this class), I will simply alert you when the explicit form of $\mathcal{L}_{3+1}$ for a perfect fluid is actually needed.

Consider solutions depending on both $t$ and $y$ that are homogeneous and isotropic in the three transverse
directions, and write the metric in the form

$$ds^2 = e^{2A(t,y)}\left(-dt^2 + dy^2\right) + e^{2B(t,y)}\left(dx_1^2 + dx_2^2 + dx_3^2\right) = -e^{2A(u,v)}\, du\, dv + e^{2B(u,v)}\left(dx_1^2 + dx_2^2 + dx_3^2\right) \tag{23}$$

with the light cone coordinates $u = t - y$ and $v = t + y$. The metric for the $(3 + 1)$-dimensional universe on
the brane can be written in the standard form of a Friedmann-Lemaître-Robertson-Walker universe $ds^2 =
-d\tau^2 + a(\tau)^2(dx_1^2 + dx_2^2 + dx_3^2)$, with $\tau = \int dt\, e^{A(t,0)}$ and $a(\tau) = e^{B(t,0)}$.

Now solve Einstein’s equation (20) and the dilaton equation (22) in the bulk, that is, away from the brane, to
obtain $A(t, y)$, $B(t, y)$, and $\varphi(t, y)$ for $y > 0$ and $y < 0$ separately, and then match the solutions across the brane.
It is convenient to use the $(u, v)$ light cone coordinates to solve the equations in the bulk, and the $(t, y)$ coordinates
to do the matching across the brane. Before reading on, you are cordially invited to solve these equations.


* The 4/3 is conventional and can be absorbed in $b$.

The bulk equations of motion are (with the notation $B_{,u} \equiv \partial B/\partial u$, $B_{,uv} \equiv \partial^2 B/\partial u\, \partial v$, and so forth)

$$B_{,uv} + 3B_{,u}B_{,v} = 0 \tag{24}$$

$$2\varphi_{,uv} + 3\left(B_{,u}\varphi_{,v} + B_{,v}\varphi_{,u}\right) = 0 \tag{25}$$

$$A_{,u}B_{,u} - \tfrac12\left(B_{,u}^2 + B_{,uu}\right) - \tfrac29\,\varphi_{,u}^2 = 0 \tag{26}$$

$$A_{,v}B_{,v} - \tfrac12\left(B_{,v}^2 + B_{,vv}\right) - \tfrac29\,\varphi_{,v}^2 = 0 \tag{27}$$

$$2\varphi_{,u}\varphi_{,v} + 3\left(A_{,uv} + 2B_{,uv} + 3B_{,u}B_{,v}\right) = 0 \tag{28}$$

These equations are, respectively, the $uv$ component of Einstein’s equation; the dilaton equation of motion; and
the $uu$ component, the $vv$ component, and the $ii$ component of the Einstein equation. (The last equation is not
independent of the first four by virtue of the Bianchi identity.)

As mentioned above, we have to match the solutions for y < 0 and y > 0. The matching conditions for the 
metric and for the dilaton at the brane can be obtained by writing out (20) and (22) in the (¢, y) coordinates and 
integrating across y = 0 at fixed r: 


6 4] = — et+be] (A — 29 — 3P) (29) 
y 
6] =e aren (30) 
y 
a 
; 4 = be***| (A + p) (31) 


Alert! To write down the right hand side of (31), you integrate (22), and hence you have to know what $\mathcal{L}_{3,1}$ for
a perfect fluid is. (Note that, in contrast, to write down (29) and (30), you merely have to know what the stress
energy tensor of a perfect fluid is, which you have known since part III.) So, if you insist on not taking anybody’s
word for what this is, you could simply set $\rho = 0$ and $P = 0$ in (29)-(31).

Note that these matching conditions have to be satisfied at any instant in t, with the two sides in (29)-(31)
functions of t. Thus, we have the rather strong constraint that


$$[\varphi_{,y}] = -\tfrac94\,b\,[B_{,y}] \qquad (32)$$


In many solutions, this condition forces $\varphi$ to be proportional to $B$.

Without the perfect fluid, we see from (29) and (30) that A equals B up to a possible additive constant. 

We now have to deal with the “finger of God” problem mentioned in the text. The simplest and most natural 
assumption is that the bulk degrees of freedom simply respond to motion on the brane, but they do not act on 
the brane. In this spirit, the authors of the outgoing brane wave model simply decree by fiat that there are no 
incoming bulk waves, only outgoing waves. 

A simple class of solutions to the bulk equations (24)-(28) consists of taking $A$, $B$, and $\varphi$ to depend only on
$u$. Of the five bulk equations, only one, namely (26), survives:


$$A_{,u}B_{,u} - \tfrac12\left(B_{,uu} + B_{,u}^2\right) - \tfrac29\,\varphi_{,u}^2 = 0 \qquad (33)$$


This corresponds to a plane wave propagating to the right. Similarly, solutions depending only on v exist, 
representing plane waves moving to the left. 

But if $A$ depends only on $u$, we see by inspecting (23) that it can be absorbed by reparametrizing $u$. Then (33)
determines $B$ in terms of $\varphi$, or vice versa. Physically, it makes sense that there are no independent gravitational
degrees of freedom, since we have assumed isotropy in the transverse $x^i$ space. (Gravitational plane waves, as
discussed in chapter IX.4, expand some directions while contracting others.)

We now construct a solution of (20) and (22) by matching a solution depending only on $u$ (for y > 0) to a
solution depending only on v (for y < 0). The result is a solution in which the bulk spacetime consists of plane 
waves moving away from the brane on both sides. 

Write $B(u, v) = \log h(u)$ on the y > 0 side. The continuity of $B$ implies that $B(u, v) = \log h(v)$ on the y < 0
side. Since, according to (32), the jump in $\varphi_{,y}$ must be proportional to the jump in $B_{,y}$ for all t, $\varphi$ itself must be
proportional to $B$:


$$\varphi(u, v) = -\tfrac94\,b \log h(u) \qquad (34)$$

for y > 0, and $\varphi(u, v) = -\tfrac94\,b \log h(v)$ for y < 0. Note that an additive constant in $\varphi(u, v)$ can be absorbed by
scaling $h$, and the consequent additive constant in $B$ can be absorbed by scaling $x^i$. We can now immediately
integrate (33) to obtain*


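$$A(u, v) = \left(\tfrac12 + \tfrac98 b^2\right) B(u, v) + \tfrac12 \log B_{,u}(u, v)$$
$$\phantom{A(u, v)} = \tfrac98 b^2 \log h(u) + \tfrac12 \log h'(u) \qquad (35)$$


for y > 0, and $A(u, v) = \tfrac98 b^2 \log h(v) + \tfrac12 \log h'(v)$ for y < 0. An additive constant can be absorbed by scaling $u$
and $v$.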

The function h is determined once we pick an equation of state for the matter on the brane. In other words, 
the equation of state fixes both the amplitude of the bulk waves and the dynamics of the brane geometry. 

A simple example is to let the total pressure and total energy density be linearly related: $P - \Lambda = \gamma(\rho + \Lambda)$,
with $\gamma$ a constant. Then the matching conditions (30) and (29) imply that the jump in $A_{,y}$ is proportional to the
jump in $B_{,y}$. Since this must hold for all time, we obtain


$$A = -(2 + 3\gamma)B + k \qquad (36)$$


for some constant k. This yields a first order equation for h that can be integrated explicitly. 

Without a perfect fluid, $P$ and $\rho$ vanish and so $\gamma = -1$, leading to $A = B + k$ as anticipated.

To illustrate some of the features of this model, we start with a particularly simple special case. Inspection of
(33) shows that we can choose $\varphi$, $A$, and $B$ to be linear functions of $u$ on the y > 0 side and linear functions of
$v$ on the y < 0 side, respectively. This corresponds to setting


$$h(u) = e^{\lambda u} \qquad (37)$$

where we assume that the constant $\lambda$ is positive. Thus, on the y > 0 side, $B = \lambda u$, $\varphi = -\tfrac94 b\lambda u$, and

$$A = \tfrac12\left(1 + \tfrac94 b^2\right)\lambda u + \tfrac12 \log\lambda \qquad (38)$$

and similarly on the y < 0 side. We now set $b = \pm\tfrac23$. (The case of general b will be considered below.) Then
$A = B +$ const, and it follows from (36) that $\gamma = -1$, so the stress energy on the brane is a pure cosmological
constant. From (30), the vacuum energy is

$$\Lambda = 12\lambda^{1/2} \qquad (39)$$


The bulk metric is


$$ds^2 = e^{2\lambda(t-y)}\left[-\lambda\,dt^2 + \lambda\,dy^2 + dx_i\,dx^i\right] \qquad (40)$$

for y > 0 and

$$ds^2 = e^{2\lambda(t+y)}\left[-\lambda\,dt^2 + \lambda\,dy^2 + dx_i\,dx^i\right] \qquad (41)$$

for y < 0. Changing to cosmological time $\tau = e^{\lambda t}/\sqrt{\lambda}$, we see that the metric on the brane at y = 0 has the
Friedmann-Lemaitre-Robertson-Walker form $ds^2 = -d\tau^2 + \lambda\tau^2 dx_i\,dx^i$, which after scaling $x^i$, gives

$$ds^2 = -d\tau^2 + \tau^2 dx_i\,dx^i \qquad (42)$$


Surprise! Even with a cosmological constant $\Lambda$, we have a universe with the scale factor growing linearly
$a(\tau) = \tau$ rather than the usual exponential growth. Remarkably, the expansion rate is independent of the value
of the vacuum energy $\Lambda$ (as long as it is nonzero). In the literature, this is known as self-tuning.
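
The linear growth can be checked with a few lines of arithmetic. The sketch below assumes the plane wave solution in the form $B = \lambda u$, $\varphi = -\tfrac94 b\lambda u$, $A = \tfrac12(1 + \tfrac94 b^2)\lambda u + \tfrac12\log\lambda$, and the bulk equation (33) in the form $A_{,u}B_{,u} - \tfrac12(B_{,uu} + B_{,u}^2) - \tfrac29\varphi_{,u}^2 = 0$; the function names are illustrative and do not come from the text.

```python
import math

def residual_33(lam, b):
    """Left-hand side of (33) for the linear solution; all u-derivatives are constants."""
    A_u = 0.5 * (1.0 + 2.25 * b * b) * lam   # dA/du
    B_u = lam                                 # dB/du (so B_uu = 0)
    phi_u = -2.25 * b * lam                   # dphi/du
    return A_u * B_u - 0.5 * (0.0 + B_u ** 2) - (2.0 / 9.0) * phi_u ** 2

def brane_clock_and_scale(lam, t):
    """Cosmological time tau and scale factor a on the brane (y = 0), for b = +-2/3."""
    tau = math.exp(lam * t) / math.sqrt(lam)  # tau = integral of e^{A(t,0)} dt, e^A = sqrt(lam) e^{lam t}
    a = math.exp(lam * t)                     # a = e^{B(t,0)}
    return tau, a

def vacuum_energy(lam):
    """Equation (39), assumed to read Lambda = 12 sqrt(lam)."""
    return 12.0 * math.sqrt(lam)

if __name__ == "__main__":
    for lam in (0.25, 1.0, 9.0):
        tau, a = brane_clock_and_scale(lam, t=1.3)
        # a / (sqrt(lam) * tau) is 1 at every t: after rescaling x^i, a(tau) = tau
        print(lam, vacuum_energy(lam), residual_33(lam, 2.0 / 3.0), a / (math.sqrt(lam) * tau))
```

Changing `lam` changes the vacuum energy but not the rescaled ratio in the last column, which is the self-tuning statement in numerical form.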

The solution here does not have naked timelike singularities in the bulk, which appear to be a generic feature
of the static solutions discussed in the literature. Rather, the factor $e^{2\lambda(t-y)} = e^{2\lambda u}$ in (40) shows that the spacetime
has a null singularity at $u = -\infty$ on the right of the brane. Similarly, it also has a null singularity at $v = -\infty$
on the left. Null geodesics from these singularities reach the brane by finite affine parameters, so they are not
really at infinity. This is most easily seen by introducing a new coordinate $U = e^{2\lambda u}/2$, so that the metric for y > 0
becomes


$$ds^2 = -dU\,dv + 2U\,dx_i\,dx^i \qquad (43)$$


* Throughout, it is understood that all functions inside the logarithm have absolute value signs. 
The singularity is now at $U = 0$. The brane is located at $u = v$ or $U = e^{2\lambda v}/2$, so the brane never actually hits the
singularity, but instead becomes asymptotically null as $v \to -\infty$. The geometry on the left of the brane is similar,
with the roles of $u$ and $v$ interchanged.

Next, let us determine Newton’s constant $G$ seen by observers on the (3 + 1)-dimensional brane. We go
through the same discussion as in the text, considering a fluctuation of the metric $g_{\mu\nu}$ on the brane and evaluating
the corresponding action. As in (17), we end up with an integral over y. Since the null singularity cuts off space in 
the fifth direction, we obtain a nonzero effective G. But since the distance of the singularity to the brane changes 
with time, G will be time dependent, posing a serious problem for this model. 

Remarkably, we can find solutions of the bulk equations (24)-(28) that are much more general than the plane
waves discussed so far. The key is to observe that (24) involves only $B$ and can be solved to give


$$B(u, v) = \tfrac13 \log\left(f(u) + g(v)\right) \qquad (44)$$


where f and g are two arbitrary functions. (A possible additive integration constant can always be absorbed by 
an overall scaling of $f$ and $g$.) Given $B(u, v)$, we can solve the dilaton equation of motion (25). Using separation
of variables, the general solution can be written as


$$\varphi(u, v) = \int dk\, \frac{c(k)}{\sqrt{\left(f(u) - k\right)\left(g(v) + k\right)}} \qquad (45)$$

with some smooth function $c(k)$. These solutions can be used to study a phase transition in which the vacuum
energy changes on an initially static, Poincaré invariant brane. In one solution, the brane becomes time dependent
after the transition.
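
Readers who prefer a numerical sanity check to algebra can test these solutions by finite differences. The sketch below assumes (24) and (25) in the forms $B_{,uv} + 3B_{,u}B_{,v} = 0$ and $2\varphi_{,uv} + 3(B_{,u}\varphi_{,v} + B_{,v}\varphi_{,u}) = 0$, takes $B = \tfrac13\log(f + g)$, and tries a single mode $\varphi_k = 1/\sqrt{(f - k)(g + k)}$. The particular $f$, $g$, and $k$ are arbitrary choices made here for illustration.

```python
import math

def f(u):
    return 2.0 + u * u          # hypothetical choice of f(u)

def g(v):
    return 3.0 + math.exp(v)    # hypothetical choice of g(v)

def B(u, v):
    return math.log(f(u) + g(v)) / 3.0

def phi_k(u, v, k=0.5):
    # single mode; requires f(u) - k > 0 and g(v) + k > 0
    return 1.0 / math.sqrt((f(u) - k) * (g(v) + k))

def d_u(F, u, v, h=1e-3):
    return (F(u + h, v) - F(u - h, v)) / (2.0 * h)

def d_v(F, u, v, h=1e-3):
    return (F(u, v + h) - F(u, v - h)) / (2.0 * h)

def d_uv(F, u, v, h=1e-3):
    return (F(u + h, v + h) - F(u + h, v - h)
            - F(u - h, v + h) + F(u - h, v - h)) / (4.0 * h * h)

def residual_24(u, v):
    """Finite-difference estimate of B_uv + 3 B_u B_v."""
    return d_uv(B, u, v) + 3.0 * d_u(B, u, v) * d_v(B, u, v)

def residual_25(u, v):
    """Finite-difference estimate of 2 phi_uv + 3 (B_u phi_v + B_v phi_u)."""
    return (2.0 * d_uv(phi_k, u, v)
            + 3.0 * (d_u(B, u, v) * d_v(phi_k, u, v)
                     + d_v(B, u, v) * d_u(phi_k, u, v)))

if __name__ == "__main__":
    # both residuals should vanish up to discretization error
    print(residual_24(0.7, 0.3), residual_25(0.7, 0.3))
```

Because (25) is linear in $\varphi$, a superposition of such modes with any smooth weight is again a solution, so checking one mode is enough.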


I refrain from giving more details here and simply refer the interested reader to the original paper on the 
outgoing brane wave model. It is at least mathematically interesting that the coupled Einstein and dilaton 
equations appear to have many solutions. As I said earlier, you now know enough to contribute, if you wish, 
to the brane world literature. 


Notes 


1. At the level of a popular book, I offer the following analogy. A biologist puts a few drops of water from a 
pond between 2 thin glass plates. To the life forms in the 2-dimensional world between the plates, the world 
outside the glass plates is beyond comprehension. Yet streams of photons can pass back and forth between 
the 2-dimensional world and the world beyond. See Toy/Universe, p. 250. 

2. See the original papers by J. Polchinski. For a textbook on brane physics, see C. Johnson, D-Branes, Cambridge 
University Press, 2006. 

3. The notion of large extra dimensions goes back to V. Rubakov and M. Shaposhnikov in 1983 and to M. Visser.
More recently, the subject was developed in 1990 by I. Antoniadis, and then in 1998 by N. Arkani-Hamed,
S. Dimopoulos, and G. Dvali, and by G. Shiu and S.-H. Tye.

4. Let me remind you how this works. The 1-dimensional integral $f(x) = \frac{1}{2\pi}\int_{-K}^{K} dk\, e^{ikx} = \frac{1}{\pi}\int_0^K dk \cos(kx) =
\sin(Kx)/(\pi x)$. In the limit $K \to \infty$, this function $f(x)$ approaches the delta function $\delta(x)$, since $f(0) =
K/\pi \to \infty$, and for $x \neq 0$, it oscillates rapidly with an amplitude that quickly tends to 0. The identity used
in the text is the trivial 3-dimensional generalization of this.

5. QFT Nut, chapter I.4.

6. QFT Nut, chapter I.5.

7. More precisely, the repulsion between like electric charges is due to this sign flip between $\eta_{00}$ and $\eta_{ii}$.

8. At this distance scale, the gravitational force is easily overwhelmed by the electromagnetic force, and on the
nanoscale, even by Casimir forces.

9. L. Randall and R. Sundrum, Phys. Rev. Lett. 83 (1999), p. 3370. 

10. D. Tymoczko, A Geometry of Music, Oxford University Press, 2011. See p. 410. Note that Pythagoras, whose
influence pervades this entire textbook, was also into geometry and music.

11. This relation has emerged from some papers by P. Horava and E. Witten. 

12. L. Randall and R. Sundrum, Phys. Rev. Lett. 83 (1999), p. 4690. 

13. G. Horowitz, I. Low, and A. Zee, Phys. Rev. D 62 (2000), p. 086005.

14. It is $\mathcal{L} = -\rho$. See, for example, S. Endlich, A. Nicolis, R. Rattazzi, and J. Wang, JHEP 4 (2011), p. 102.


Effective Field Theory Approach to Einstein Gravity 


Powers of derivatives and the long distance expansion 


Back when I airlifted you to the Einstein-Hilbert action, you might have asked, “The scalar 
curvature R is not the only coordinate scalar we could have formed out of the metric. 
What about other possibilities?” When I teach Einstein gravity, someone usually asks this 
question. I would respond that R is the only coordinate scalar involving two powers of 
derivatives $\partial$. True, we also have the scalars $R^2$, $R_{\mu\nu}R^{\mu\nu}$, and $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}$, but they all involve
four powers of derivatives. To understand better what to do with these possible terms
in the action, let us step back and examine the much simpler case of Newtonian gravity. 

Way way back, in chapters II.1 and II.3, I reminded you that in Newtonian gravity, the
gravitational potential $\Phi$ satisfies Poisson’s equation $\nabla^2\Phi(x) = 4\pi G\rho(x)$. Let’s see how a
really poor man, an impoverished man who doesn’t know how to solve partial differential
equations, would determine $\Phi$ around an object of mass $M$ and radius $R$. The density is
easy, he says, $\rho \sim M/R^3$. Mired in poverty but nevertheless smart, he next approximates
the derivative $\nabla\Phi$ by $\Phi$ divided by the relevant distance scale $\sim R$, so that $\nabla\Phi \sim \Phi/R$ and
$\nabla^2\Phi \sim \Phi/R^2$. Then he writes*


$$\nabla^2\Phi \sim \Phi/R^2 \sim G\rho \sim GM/R^3 \qquad (1)$$


which requires only algebra to solve, giving $\Phi \sim GM/R$, the right answer. No need to take
a fancy† course in partial differential equations!
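
The poor man's algebra is short enough to run. The sketch below follows his two steps literally, $\rho \sim M/R^3$ and $\Phi/R^2 \sim G\rho$, and then divides by $c^2$ to get a dimensionless measure of how weak the field is. The function names and the rounded values for the Earth and the Sun are supplied here for illustration.

```python
G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s

def poor_mans_potential(mass_kg, radius_m):
    """Order-of-magnitude potential: Phi/R^2 ~ G*rho with rho ~ M/R^3, so Phi ~ G*M/R."""
    density = mass_kg / radius_m ** 3      # rho ~ M/R^3, dropping the factor 4*pi/3
    return G * density * radius_m ** 2     # solve Phi/R^2 ~ G*rho for Phi

def dimensionless_potential(mass_kg, radius_m):
    """Phi/c^2, a rough measure of the strength of gravity at the surface."""
    return poor_mans_potential(mass_kg, radius_m) / C ** 2

if __name__ == "__main__":
    bodies = {"Earth": (5.972e24, 6.371e6), "Sun": (1.989e30, 6.957e8)}  # rounded values
    for name, (m, r) in bodies.items():
        print(name, poor_mans_potential(m, r), dimensionless_potential(m, r))
```

No factor of $4\pi$ survives the estimate, which is the point: only the scaling with $G$, $M$, and $R$ is being claimed.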

Now the same guy who wants to know why we didn’t add terms involving four powers 
of derivatives, such as $R^2$, to the Einstein-Hilbert action might also ask why Newton (or


* You will recall that we used a similar argument in the preceding chapter. 

† Beginning students often snicker at this sort of getting an answer by “winging it,” compared to solving a
partial differential equation in all its glory, complete with factors of $2\pi$ and what not. But in fact, in cutting edge
research, the ability to do the former is often much more prized than the ability to do the latter. On the cutting 
edge, the analog of the partial differential equation is typically not known, but the truly great theorists are often 
able to grope for what they want in the dark “by the seat of their pants.” 
Poisson) did not add terms involving four powers of derivatives to the equation determining
$\Phi$. Indeed, we could invite ourselves to write


$$\nabla^2\Phi(x) + l^2\nabla^2\nabla^2\Phi(x) = 4\pi G\rho(x) \qquad (2)$$


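By dimensional analysis, we have to introduce an unknown length scale $l$ characterizing
the deviation from Newtonian gravity. The impoverished man kindly solves this equation
for us:


$$\nabla^2\Phi + l^2\nabla^2\nabla^2\Phi \sim \frac{\Phi}{R^2} + l^2\frac{\Phi}{R^4} \sim \frac{GM}{R^3} \qquad (3)$$


Hence


$$\Phi \sim \frac{GM}{R\left(1 + \frac{l^2}{R^2}\right)} \simeq \frac{GM}{R}\left(1 - \frac{l^2}{R^2} + \cdots\right)$$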


and we see that the effect of the added term is negligible for $l \ll R$. Indeed, we can reach
the same conclusion by looking directly at the postulated equation (2).

We can also turn the argument around. The fact that deviation from Newtonian gravity
has not been observed down to a certain length scale allows us to set an upper bound* on
the unknown length $l$.
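
To see how quickly the correction dies away, one can tabulate $(l/R)^2$. The sketch below assumes the estimate $\Phi \sim (GM/R)/(1 + l^2/R^2)$ from the scaling argument above. The sample values of $l$ and $R$ are illustrative only, and like the argument itself the numbers are meaningful only as orders of magnitude.

```python
def fractional_deviation(l, R):
    """Relative size of the four-derivative term in the scaling estimate: (l/R)^2."""
    return (l / R) ** 2

def potential_with_correction(newtonian_value, l, R):
    """Scaling estimate Phi ~ Phi_Newton / (1 + l^2/R^2)."""
    return newtonian_value / (1.0 + fractional_deviation(l, R))

if __name__ == "__main__":
    l = 1.0e-3                                   # hypothetical deviation scale: 1 mm, in meters
    for R in (1.0e-4, 1.0e-3, 1.0, 6.371e6):     # probe distances in meters
        print(R, fractional_deviation(l, R), potential_with_correction(1.0, l, R))
```

Only a probe at $R \lesssim l$ sees an order-one effect, which is why a null result at a given distance translates into an upper bound on $l$.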


Einstein-Hilbert action as merely effective 


This simple but elegant argument forms the basis of the so-called effective field theory 
approach¹ emphasized by Ken Wilson and others and is much used in contemporary
theoretical physics. 

In the context of gravity, yes, we are certainly more than welcome to add higher derivative 
terms to the Einstein-Hilbert action, so that 


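$$S = M_P^2 \int d^4x \sqrt{-g}\, R$$
$$\phantom{S} \to M_P^2 \int d^4x \sqrt{-g} \left(R + l^2\left(\alpha R^2 + \beta R_{\mu\nu}R^{\mu\nu} + \gamma R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}\right) + \cdots\right) \qquad (4)$$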


Again, high school dimensional analysis forces us to introduce a length $l$, and 3 numbers²
$\alpha$, $\beta$, and $\gamma$ of order unity. The ellipsis indicates terms involving cubes and ever higher
powers of the curvature tensor, objects such as $R_{\mu\nu\rho\sigma}R^{\rho\sigma\lambda\kappa}R_{\lambda\kappa}{}^{\mu\nu}$. Since the only length scale
we know associated with gravity is the Planck length $l_P$, we naturally assume that $l = l_P$.


However, in theoretical physics, we should of course always keep an open mind. The verity 
of this commonly made assumption has to be checked by experimentalists. Indeed, this 
is one of the issues concerning gravity, an issue we will come back to in chapters X.7 
and X.8. 


* The length $l$ characterizes the distance scale at which deviations from our present knowledge of gravity
might show up. As of this writing, $l \lesssim 1$ mm, as I have already mentioned in chapter X.2.
Let us go ahead and assume $l = l_P$. We now use the same argument as before. In working
out the gravitational field in some given physical situation, we effectively convert the
derivative $\partial$ acting on the metric in the action, and hence in the equations of motion, into
$\sim 1/L$, with $L$ some characteristic length scale over which the metric varies. Thus, we expect
the effects of the higher order terms $R^2$, $R_{\mu\nu}R^{\mu\nu}$, and $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}$ (known as the Weyl-Eddington
terms³) to be suppressed by $\sim (l/L)^2$, which normally is almost infinitesimal if
$l = l_P$. The effects of the terms represented by the ellipsis in (4) are suppressed even more
severely. This explains why it sufficed to keep only the Einstein-Hilbert term in the action
back in chapter VI.1.


Effective field theory 


The example of gravity suffices to show how the effective field theory approach, which 
pervades contemporary particle and condensed matter physics, works. We classify all 
possible terms in an action by powers of derivatives. The relative coefficients of these 
terms are then fixed by dimensional analysis to be some inherent length $l$ (possibly $l_P$
in the case of gravity) raised to the appropriate powers. The effects of various terms are
then controlled by various powers of $(l/L)$, with $L$ some characteristic length scale of the
physical phenomenon* we are studying. 

Condensed matter physicists like to think in terms of distance, but particle physicists
tend to think about an energy or mass scale.⁴ Thanks to Planck’s $\hbar$, classification in terms of
an energy scale is equivalent to classification in terms of a distance scale, but conceptually
they should be kept distinct. For example, the scalar curvature $R$ has mass dimension 2,
while $R^2$, $R_{\mu\nu}R^{\mu\nu}$, and $R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}$ have mass dimension 4, and so on.

Our discussion above indicates that a term of mass dimension⁵ $p$ in the action must
have a coefficient that, according to high school dimensional analysis, goes like $1/M^{p-4}$.
Here $M$ denotes some (usually unknown) mass scale at which the physics associated with
that term kicks in. Thus, in a process characterized by energy $E$, the effects of that term
would be of order $(E/M)^{p-4}$. This is one reason particle physicists are always clamoring
for higher energy accelerators. We will come back to this point in chapter X.8.
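
The power counting is easy to tabulate. The sketch below evaluates $(E/M)^{p-4}$ for a few mass dimensions $p$. The energies are round numbers chosen here for illustration (roughly an accelerator scale and the Planck scale, in GeV), not values taken from the text.

```python
def suppression(energy, mass_scale, p):
    """Estimated relative effect (E/M)^(p-4) of a term of mass dimension p."""
    return (energy / mass_scale) ** (p - 4)

if __name__ == "__main__":
    E = 1.0e4    # hypothetical process energy in GeV
    M = 1.0e19   # hypothetical heavy scale in GeV, of order the Planck mass
    for p in (2, 4, 6, 8):
        # p > 4: suppressed at E << M; p = 4: no dependence on M; p < 4: enhanced
        print(p, suppression(E, M, p))
```

Raising $E$ toward $M$ is the only way to make the $p > 4$ entries visible, which is the accelerator argument in one line.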

Our friend the Smart Experimentalist speaks up, “Indeed, it would be the height of 
hubris, almost inimical to the spirit of physics, for you theorists to suppose that your 
action⁶ du jour is actually the ultimate. The established actions in physics describe Nature
only at the length or energy scales we have explored experimentally.” 

We totally agree. In quantum field theory, all possible terms not explicitly forbidden by 
the symmetries of the theory are mandated, as was already mentioned in chapter VI.2. The 
eternal hope of theoretical physics is that, for a given set of phenomena, keeping only a 
few dominant terms in the action suffices. All the actions studied in this book should be 
regarded in this light. 


* We will return to this point when we talk about the quantum Hall fluid in chapter X.5. 




Appendix 1: The cosmological constant paradox once again 


In the text, we added terms with higher mass dimensions than the scalar curvature R. What about terms with 
lower mass dimensions? In fact, the term 1, with mass dimension 0, is also allowed. We are free to write 


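$$S = M_P^2 \int d^4x \sqrt{-g} \left(\frac{1}{l_C^2} + R + l^2\left(\alpha R^2 + \beta R_{\mu\nu}R^{\mu\nu} + \gamma R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}\right) + \cdots\right) \qquad (5)$$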


where, again, by high school dimensional analysis we were forced to introduce a length scale $l_C$ which a priori
may or may not be the same as /. Compare with (4). 

Recall from chapter VI.2 that the “1” term is, once again, the dreaded cosmological constant, with the
identification of the energy density $\Lambda$ as


$$\Lambda \sim \frac{M_P^2}{l_C^2} \qquad (6)$$


Without bothering to plug in numbers, we can see that $l_C$ is enormous. From chapters VI.2 and VIII.1, we
know that to a first approximation, our universe is dominated by the cosmological constant, aka dark energy, with
the scale factor determined by Einstein’s field equation $(\dot a/a)^2 \sim G\rho$, so that* $H^2 \sim G\Lambda \sim \Lambda/M_P^2$. Comparing this
with (6), we see that $l_C$ is the Hubble size of the universe, and so is almost inconceivably larger than $l$, even if, in
a departure from conventional wisdom, we take $l$ to be much larger than the Planck length $l_P$. This humongous
discrepancy amounts to another statement of the cosmological constant paradox.⁷ Nature flagrantly violates the
theorist’s cherished naturalness doctrine.

Since $\Lambda$ is an energy density with dimensions $M/L^3 \sim M^4 \sim 1/L^4$, we can define a length scale $l_\Lambda$ associated
with the cosmological energy density by $\Lambda \equiv 1/l_\Lambda^4$. A possibly illuminating way of writing (6) is


$$l_\Lambda \simeq \sqrt{l_P\, l_U} \qquad (7)$$


where we renamed $l_C$ as $l_U$, the length scale or size of the universe. Einstein tells us that the length scale associated
with the dark energy or the cosmological constant is the geometric mean⁸ of the smallest and the largest lengths
known in physics. Rather mysterious!
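
It is amusing to put numbers into the geometric mean. The sketch below takes the Hubble length $c/H_0$ for $l_U$, with an assumed round value $H_0 = 70$ km/s/Mpc, and a rounded Planck length; both inputs and the function names are supplied here. Only the order of magnitude of the result is meaningful.

```python
import math

L_PLANCK = 1.616e-35   # Planck length in meters (rounded)
C = 2.998e8            # speed of light, m/s
MPC = 3.086e22         # one megaparsec in meters

def hubble_length(h0_km_s_mpc=70.0):
    """l_U ~ c/H0; the default H0 is an assumed round value."""
    return C / (h0_km_s_mpc * 1.0e3 / MPC)

def dark_energy_length(l_p=L_PLANCK, l_u=None):
    """Geometric mean sqrt(l_P * l_U) of the smallest and largest lengths."""
    if l_u is None:
        l_u = hubble_length()
    return math.sqrt(l_p * l_u)

if __name__ == "__main__":
    # the result lands at tens of microns, i.e. a sub-millimeter length
    print(hubble_length(), dark_energy_length())
```

Two lengths separated by about sixty orders of magnitude average, in the geometric sense, to a thoroughly human scale somewhat below a millimeter.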

The cosmological constant paradox unmasks theoretical physicists as double-talking snake oil salesmen:† in
the effective action for gravity, they want $l$ to be tiny on the one hand and $l_C$ to be enormous on the other.

We will come back to the cosmological constant paradox in chapter X.7. 


Appendix 2: Reversal of fortune 


This appendix is for those readers with some knowledge of quantum field theory. Other readers should skip this 
upon first reading of the book. 

The view of quantum field theory as a low energy effective theory sketched here represents a remarkable 
shift in attitude toward quantum field theory over the past 30 years. Traditionally, a term in the action for a 
quantum field theory is classified according to whether its mass dimension is < 4, = 4, or > 4, known respectively 
as superrenormalizable, renormalizable, and nonrenormalizable. Textbooks taught that superrenormalizable 
interactions were nice, renormalizable interactions were what we want, while nonrenormalizable interactions 
should fill us with fear and loathing. 

The reason is simple. As already explained in the text, terms with mass dimension p lead to contributions 
going like $(E/M)^{p-4}$, and so nonrenormalizable terms with p > 4 diverge badly at high energies.

In an astonishing reversal of fortune, the nonrenormalizable terms are now welcomed and well liked as terms
that are inevitably here with us. They are regarded as innocuous, since they are suppressed by powers of $1/M$,
with $M$ some higher mass scale. In contrast, our former friends the superrenormalizable terms are now regarded as
nasty guys.


* Which of course in this context is just the statement that the first two terms in (5) battle each other to a
standstill.
† Well, not quite. I exaggerate totally. Forget that I said that.

Since these nasty guys have nominal mass dimension < 4, there are fortunately only a finite number of them.
They represent the challenges confronting fundamental physics today, and are in turn known as the Higgs mass
term, the Einstein-Hilbert term, and the cosmological constant term. The Higgs mass term has dimension 2. The
Einstein-Hilbert term has nominal dimension 2, which after rescaling⁹ by the Planck mass, becomes dimension
4 + 5 + 6 + ⋯. The cosmological constant term has nominal dimension 0, which after rescaling, becomes
dimension 0 + 1 + 2 + ⋯. Perhaps there is something seriously wrong with this picture.

Our present understanding of physics is based on this notion of effective field theory, to which all we know can
be reduced. Yet there are many questions, many doubts, but no clear answers. Field theory itself, and Einstein
gravity as an effective field theory, could fail at truly long distances. More in chapter X.7.


Appendix 3: Nonlocal cosmology 


With a universe dominated by a cosmological constant A or dark energy, adding local terms to Einstein gravity 
as in (4) will not significantly change large scale Lemaitre-de Sitter cosmology. The field equation is modified by
additional local terms of the form $R_{\mu\lambda\rho\sigma}R_{\nu}{}^{\lambda\rho\sigma}$, for example. But with the maximally symmetric de Sitter form of
the Riemann curvature tensor, all these terms reduce to some combinations of the Hubble parameter $H$ times
$g_{\mu\nu}$, and so only the relation between $H$ and $\Lambda$ is modified.

One way around this situation is to introduce nonlocal terms, for example replacing the Einstein-Hilbert term
$R$ by $R f(\frac{1}{\Box}R)$, with $\Box = (1/\sqrt{-g})\,\partial_\mu(\sqrt{-g}\,g^{\mu\nu}\partial_\nu)$ the covariant version of $\partial^2$. With a suitable choice¹⁰ of the
function $f(x)$, but without having to introduce a cosmological constant $\Lambda$ or dark energy, this type of nonlocal
action can reproduce current observations, including the accelerating expansion. To me, an attractive feature of
this approach is that quantum field theory with known physics naturally generates this type of nonlocal term via
loop corrections involving the massless graviton. One may regard these nonlocal terms as due to the cumulative
effect of the fluctuating graviton, an effect that manifests itself only on cosmological distance scales. A drawback
is of course that this comes with the freedom of adjusting an entire function¹¹ to fit data.


Appendix 4: More on the scalar field 


Starting in part V, I have extolled the power of Einstein’s equivalence principle: given a Lagrangian in
Minkowskian spacetime, in which various Lorentz indices are contracted with $\eta_{\mu\nu}$ and its inverse, we simply
replace $\eta_{\mu\nu}$ by $g_{\mu\nu}$ and immediately obtain the corresponding Lagrangian in curved spacetime. For a scalar
field $\varphi(x)$, we go immediately from $\mathcal{L} = -\tfrac12\eta^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi - V(\varphi)$ to $\mathcal{L} = -\tfrac12 g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi - V(\varphi)$.

But in the spirit of effective field theory, we are also free to add to $\mathcal{L}$ the term $\xi R\varphi^2$, with $\xi$ some numerical
constant. Note that this term also has mass dimension 4, just like the term $g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi$. Let the characteristic
length scale over which $\varphi$ and the scalar curvature $R$ vary be $l_\varphi$ and $L_R$, respectively. Then the relative importance
of this additional nonminimal term versus the standard kinetic term $g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi$ is given by $\sim \xi(l_\varphi/L_R)^2$. This
is another example underlining why the equivalence principle is always formulated with the caveat “in a small
enough region of spacetime,” as first discussed in the prologue to book 2. Here, the region of spacetime over
which $\varphi$ varies has to be small compared with the region over which the curvature varies for the equivalence
principle to hold.

Note that the energy momentum tensor $T^{\mu\nu}$ of the scalar field is corrected by this additional term, resulting
in what became known as the “new and improved”¹² energy momentum tensor, much discussed in the field
theory literature in the 1970s. 


Exercise 


1 Describe how higher derivative terms can be added to Maxwell’s theory of electromagnetism and discuss 
their physical manifestations. 
Notes


1. QFT Nut, chapter VIII.3, pp. 452 ff.

2. Evidently, one of them can be absorbed into $l$.

3. As you can imagine, there has been a vast literature regarding these terms going back to our forebears in
theoretical physics. For an example I know particularly well, see A. Zee, “A Theory of Gravity Based on the
Weyl-Eddington Action,” Phys. Lett. B 109 (1982), p. 183.

4. Phil Anderson once remarked to me that particle physicists renaming themselves high energy physicists
was a stroke of genius in terms of getting more funding (in the United States, that is). The name “long
distance physicists” hardly sounds thrilling, and there may be people so ignorant in Congress as to think
that condensed matter physics has something to do with condensed milk or the gunk one finds underneath
kitchen sinks.

5. The scaling dimensions of various possible terms in the action play a central role in quantum field theory.
See, for example, QFT Nut, chapters III.2 and VI.8.

6. Indeed, instead of the venerable R, some authors have proposed f(R), for some arbitrary function f that
had been revealed to them in the middle of the night.

7. See QFT Nut, chapter VIII.2.

8. This fact has been noted by a number of authors. See, for example, S. Hsu and A. Zee, arXiv:hep-th/0406142
(Mod. Phys. Lett. A 20 (2005), pp. 2699-2704).

9. As explained in chapter IX.5, h = h/Mp.

10. See S. Deser and R. P. Woodard, arXiv:0706.215v2, and related literature for details.

11. It is important to distinguish this proposal from proposals to replace the Einstein-Hilbert term R by f(R).

12. It turns out that for some special choice of $\xi$, the resulting $T^{\mu\nu}$ possesses properties much desired by particle
theorists.


Finite Sized Objects and Tidal Forces 
in Einstein Gravity 


Motion of extended objects 


Most texts on Einstein gravity treat the motion of point particles exclusively. So, good, 
watch the particles move happily along geodesics in curved spacetime. 

But in some physical situations, we may have to take into account the finite size of the 
“particles.” One example is the emission of gravitational waves from binary systems. As 
we saw in chapter IX.4, one astounding prediction of Einstein gravity is the existence* of 
gravitational waves. Various sources of gravitational waves have been studied intensively. 
One possible source consists of a black hole of size $r_S$ (its Schwarzschild radius) moving
with velocity $v$ a distance $r_0$ from another object, possibly another black hole of similar
size. As the black holes spiral into each other, they emit gravitational waves with a characteristic
wavelength $\lambda$ determined by the orbital period according to $\lambda = 2\pi r_0/v$. Thus,
the physics contains three distance scales: $r_S$, $r_0$, and $\lambda$. We will stay within the simple
“post-Newtonian” regime $r_S \ll r_0 \ll \lambda$. To leading approximation, the black hole may be
regarded as pointlike, but we might want to include corrections governed by the small
parameter $r_S/r_0$.
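
A quick numerical look shows how the three scales separate. The sketch below works in units with $c = 1$ and uses $\lambda = 2\pi r_0/v$ from the text. The estimate $v^2 \sim r_S/r_0$ is an extra assumption made here (the Newtonian order-of-magnitude relation for a bound orbit), and the function name is illustrative.

```python
import math

def binary_scales(r_s, r_0):
    """Return (v, wavelength) in units with c = 1.

    v ~ sqrt(r_s / r_0) is an assumed Newtonian order-of-magnitude estimate;
    wavelength = 2*pi*r_0 / v is the relation quoted in the text.
    """
    v = math.sqrt(r_s / r_0)
    wavelength = 2.0 * math.pi * r_0 / v
    return v, wavelength

if __name__ == "__main__":
    r_s = 1.0                        # measure all lengths in units of r_S
    for r_0 in (10.0, 100.0, 1000.0):
        v, lam = binary_scales(r_s, r_0)
        # the hierarchy r_S << r_0 << lambda widens as the orbit grows
        print(r_0, v, lam, r_s / r_0, r_0 / lam)
```

Under this assumption both small ratios, $r_S/r_0 \sim v^2$ and $r_0/\lambda \sim v/2\pi$, are controlled by the orbital velocity, so slow orbits automatically sit in the post-Newtonian regime.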


Blue sky and finite sized objects in electromagnetism 


For pedagogical clarity, let us start by retreating to the corresponding problem in electromagnetism.
Also, take spacetime to be flat. For a point particle with charge $e$ moving in
an electromagnetic field, the relevant term in the action, as we discussed in chapter IV.1,
is given by $S_{pp} = \int d\tau\, e A_\mu \dot X^\mu$, with $\dot X^\mu = dX^\mu/d\tau$.

Instead, consider an extended object, such as an atom or a molecule, that is, an electrically
neutral assembly of charged particles. Let us construct an action for the motion


* Recall also Newton’s snide remark about “competent faculty of thinking” in chapter IX.5. 
of this object, say an atom for definiteness, in an electromagnetic field. Since the overall
charge vanishes, the point particle term $S_{pp}$ is absent from the action. The individual
charged particles in the assembly are of course sensitive to $A_\mu$, but the atom as an overall
neutral collection of charged particles cannot be. Rather, as the worldlines of the individual
charged particles in the atom traverse different locations in spacetime, the atom can only
be sensitive to the spacetime variations of $A_\mu$, not to $A_\mu$ itself. By gauge invariance, these
spacetime variations must package themselves into $F_{\mu\nu}$. In fact, let us define $E_\mu = F_{\mu\nu}\dot X^\nu$
and $B_\mu = \tilde F_{\mu\nu}\dot X^\nu$, where $\tilde F_{\mu\nu}(x) = \tfrac12\epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma}$. Going to the rest frame of the particle, where
$\dot X^0 = 1$ and $\dot X^i = 0$, we see that $E_0 = 0$, $E_i = F_{i0}$, $B_0 = 0$, and $B_i = \tilde F_{i0} = -\tfrac12\epsilon_{ijk}F^{jk}$. Thus,
as the notation suggests, this is just the familiar decomposition of the electromagnetic
field into electric and magnetic fields.

Given these considerations, we see that, since Lorentz indices must be contracted, the
action can only be


$$S = \int d\tau \left(-m + c_E E_\mu E^\mu + c_B B_\mu B^\mu + \cdots\right) \qquad (1)$$


given as an expansion in terms of the size of the atom. We will not be concerned with the
higher order corrections (as indicated by the dots in (1)) due to the size of the atom, only
with the leading order. (Note that a possible term like $\int d\tau\, F_{\mu\nu}F^{\mu\nu}$ can be absorbed into
the two terms quadratic in $F_{\mu\nu}$ already displayed.) The fields $E$ and $B$ are to be evaluated
on the worldline $X^\mu(\tau)$ of the particle, of course. Physically, from the discussion above,
we know that, in the limit where we can neglect the size of the atom, the coefficients $c_E$
and $c_B$ must vanish. Thus, you will be hardly surprised to learn that they are related to
elementary concepts, such as the electric dipole moment and the magnetic moment of
the atom.

Using this action, we can derive a result familiar to everybody, including the proverbial 
guy and gal in the street, namely that the sky is blue. Consider an electromagnetic wave 
of frequency $\omega$ scattering on the atom. The quantities $E_\mu E^\mu$ and $B_\mu B^\mu$ each contain two
powers of derivatives, which, acting on the electromagnetic wave, translate into two powers
of $\omega$ in the scattering amplitude. Upon squaring the scattering amplitude to obtain the
cross section, we conclude that the cross section for an electromagnetic wave or a photon
of frequency $\omega$ to scatter on an atom or a molecule goes like $\omega^4$. Thus, as the light from the
sun traverses the atmosphere, blue light (higher frequency) scatters more than red light 
(lower frequency). As is well known, this explains why the sky is blue. 
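
How much bluer? Since the cross section goes like $\omega^4$ and $\omega \propto 1/\text{wavelength}$, the ratio of scattering at two colors is the fourth power of the inverse wavelength ratio. The sketch below uses 450 nm and 700 nm as representative blue and red wavelengths; these values and the function name are chosen here for illustration.

```python
def scattering_ratio(wavelength_a_nm, wavelength_b_nm):
    """Cross section at wavelength a divided by that at wavelength b, for sigma ~ omega^4."""
    return (wavelength_b_nm / wavelength_a_nm) ** 4

if __name__ == "__main__":
    blue, red = 450.0, 700.0   # representative wavelengths in nanometers
    # blue light scatters several times more strongly than red
    print(scattering_ratio(blue, red))
```

A modest ratio of wavelengths becomes a factor of several in scattered intensity because of the fourth power.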


The “electric” and “magnetic” components of a gravitational field 


Now that we have derived the action governing the motion of a finite sized object moving in
an electromagnetic field to leading order in the object’s size, we are ready to move on to the 
gravitational case, keeping in mind the essential differences between electromagnetism 
and gravitation. As mentioned at the beginning of this chapter, one application would be 
to study the finite sized corrections to the motion of a black hole. 
One key difference is of course that there is no analog of positive and negative charges
for gravity: masses are acted upon equally by the gravitational field without regard to race
or creed. Thus, the action necessarily starts out with the point particle action, which is
$S_{pp} = -m\int d\tau$ in this context. We don’t have objects that are neutral under gravity!

As in the electromagnetic case, a finite sized object would also be sensitive to the variations
of the metric in spacetime, and by general coordinate invariance these variations, to
leading order, must get packaged into the Riemann curvature tensor. Would the Riemann
curvature appear already contracted into the Ricci tensor and the scalar curvature? Time
for you to pause and think!

Well, these two quantities both vanish, according to Einstein’s field equation, in the
empty spacetime the black hole is moving through. In other words, the black hole can
only sample the Riemann curvature tensor $R_{\mu\lambda\nu\rho}$ itself, not the Ricci tensor and the scalar
curvature. Thus, any terms we add to $S_{pp}$ must involve the Riemann curvature tensor
$R_{\mu\lambda\nu\rho}$, with the indices not allowed to be contracted with each other. What can they be
contracted with, then?

The only thing around is the 4-velocity of the object $\dot X^\mu$. Due to the antisymmetry of
$R_{\mu\lambda\nu\rho}$, we are not able to contract all 4 indices of $R$ with $\dot X^\mu$. We can contract at most
2 indices with $\dot X^\mu$ to form the two objects

$E_{\mu\nu}(X) = R_{\mu\lambda\nu\rho}(X)\dot X^\lambda \dot X^\rho$ and $B_{\mu\nu}(X) = \tilde R_{\mu\lambda\nu\rho}(X)\dot X^\lambda \dot X^\rho$ (2)

where $\tilde R_{\mu\lambda\nu\rho}(x) = \epsilon_{\mu\lambda\sigma\tau}R^{\sigma\tau}{}_{\nu\rho}(x)/(2\sqrt{-g})$ denotes the dual* of the curvature tensor.
Indeed, these correspond to $E_\mu = F_{\mu\nu}\dot X^\nu$ and $B_\mu = \tilde F_{\mu\nu}\dot X^\nu$, respectively, in the
electromagnetic case. By analogy, the fields $E_{\mu\nu}$ and $B_{\mu\nu}$ represent the decomposition of curvature
into its “electric” and “magnetic” components, as suggested by the notation. Perhaps the
reader is not surprised that these fields now carry two indices instead of one. We now need
to square them to form scalars to put into the action.


Finite sized objects and tidal forces 


Hence, to leading order in the size of the object, the action governing its motion in a 
gravitational field is given by 


$S = \int d\tau\,\left(-m + c_E E_{\mu\nu}E^{\mu\nu} + c_B B_{\mu\nu}B^{\mu\nu} + \cdots\right)$ (3)


with two unknown constants $c_E$ and $c_B$. Compare with (1). Note that, since $E_{\cdot\cdot}$ or $B_{\cdot\cdot} \sim
R_{\cdot\cdot\cdot\cdot}\dot X^2$ has dimensions $L^{-2} = M^2$ in natural units, $c_E$ and $c_B$ must have dimensions of
$M^{-3} = L^3$ to match the dimensions of the first term in (3). As expected, they vanish as the
size of the object goes to zero.


* The appearance of $\sqrt{-g}$ will be explained in chapter X.5.
In chapter V.3, we varied the first term in (3) to obtain the standard geodesic equation
that is at the heart of Einstein’s theory. Here we obtain

$\frac{d^2X^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}(X)\frac{dX^\nu}{d\tau}\frac{dX^\lambda}{d\tau} = f^\mu(X(\tau))$ (4)

where $f^\mu(X(\tau))$ comes from varying the $E$ and $B$ terms in (3). A finite sized body
experiences a tidal force $f^\mu$ due to the varying gravitational force acting on it. It no longer
follows a geodesic. Everything makes sense.


The blue sky effect gets squared in gravity 


The fact that we had to square the “electric” and “magnetic” components of the curvature 
to form the effective action (3) means that the effects of these correction terms are highly 
suppressed. Since the Riemann curvature contains 2 derivatives, the correction terms 
involve four derivatives. The blue sky effect gets squared in gravity: for the scattering of
a gravitational wave or a graviton of frequency $\omega$ on a finite sized object, that part of the
amplitude due* to the finite size goes like $\omega^4$! You are not surprised, are you?


Appendix 


To estimate the magnitude of $c_E$ and $c_B$ for a black hole, we exploit a rather cute argument¹ as follows (cute in
the sense that it does not involve any tedious computation at all).

Consider the scattering of a graviton with frequency $\omega$ off a finite sized object (which, remember, is a black
hole in the problem we are studying). The interaction between the particle and the gravitational wave indicated
in (3) contributes a term to the scattering amplitude $\mathcal{M}$ as indicated by

$\mathcal{M} \sim \cdots + c_{E,B}\,\omega^4/M_P^2 + \cdots$ (5)

The ellipses in $\mathcal{M}$ represent effects of the interactions we have not included explicitly, for example, the one
originating from the first term in (3) (namely the term responsible for keeping us down to earth!). A nice feature
of the argument I am about to give is that we don’t even need to know what the $(\cdots)$ are. Here $c_{E,B}$ denotes the
two unknown couplings $c_E \sim c_B$ generically. We have derived the $\omega^4$ dependence just a moment ago. So the only
unexplained feature here is the power of $M_P$, which I will derive presently.

Imagine calculating the total scattering cross section for a graviton on a finite sized object. Squaring the
amplitude $\mathcal{M}$ and so forth, we end up with $\sigma(\omega) \sim \cdots + c_{E,B}^2\,\omega^8/M_P^4 + \cdots$. We can use dimensional analysis
to determine the power of $M_P$ here (and hence the power of $M_P$ in $\mathcal{M}$) as follows. The cross section $\sigma$ has
dimension of an area and hence the dimension $M^{-2}$. We had determined earlier that $c_{E,B}$ has dimension $M^{-3}$,
so that $c_{E,B}^2\,\omega^8$ has dimension $M^{-6}M^8 = M^2$. Thus, to get the dimension to match, we need to divide by $M_P^4$.
Note that the mass $m$ of the object is not available to make up the dimension.

We are now able to estimate $c_{E,B}$ for a black hole. The preceding treatment of the black hole as almost a
point particle is only valid for $\omega r_S \ll 1$ of course, that is, with the Schwarzschild radius much less than the
wavelength of the gravitational wave. But we argue that by dimensional analysis, the cross section must have the
form $\sigma(\omega) = r_S^2 f(\omega r_S)$, since the only length scale in the Schwarzschild metric is $r_S$. Expanding the unknown
function $f(\omega r_S)$ in powers of its argument, we have $\sigma(\omega) = \cdots + \gamma\,\omega^8 r_S^{10} + \cdots$, with $\gamma$ some numerical constant.
(A technical aside: the massless graviton could produce infrared factors like $\log \omega r_S$, which we ignore for our
purposes.)

Requiring that the two expressions agree, we obtain $c_{E,B} \sim M_P^2 r_S^5$. Indeed, as expected, the couplings $c_{E,B}$ are
highly suppressed as $r_S \to 0$.

* This sentence is more awkward to write than the corresponding sentence in the electromagnetic case for a
very physical reason: Even if the size of the object goes to zero, it still cannot hide from the gravitational field.
There is no escape from gravity. In contrast, a zero sized atom is invisible to the electromagnetic field.
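
The power counting in this appendix is simple enough to automate. The sketch below tracks only mass dimensions in natural units (length = 1/mass) and reproduces the exponents in $\sigma \sim c_{E,B}^2\,\omega^8/M_P^4$ and $c_{E,B} \sim M_P^2 r_S^5$. The variable names are my own, chosen for illustration.

```python
from fractions import Fraction

# Mass dimensions in natural units, as stated in the text.
DIM_SIGMA = Fraction(-2)  # cross section is an area: M^-2
DIM_C = Fraction(-3)      # c_{E,B} has dimension M^-3 = L^3
DIM_OMEGA = Fraction(1)   # frequency has dimension M^1

# sigma ~ c^2 * omega^8 / M_P^p : solve for the power p of the Planck mass.
dim_c2_omega8 = 2 * DIM_C + 8 * DIM_OMEGA   # dimension of c^2 omega^8
planck_power_in_sigma = dim_c2_omega8 - DIM_SIGMA

# Matching c^2 omega^8 / M_P^p with gamma * omega^8 * r_S^10 gives
# c ~ M_P^(p/2) * r_S^5.
planck_power_in_c = planck_power_in_sigma / 2
rs_power_in_c = Fraction(10, 2)

# Consistency check: M_P^(p/2) * r_S^5 must carry the dimension of c.
dim_of_estimate = planck_power_in_c * 1 + rs_power_in_c * (-1)

print("power of M_P in sigma:", planck_power_in_sigma)
print("c_{E,B} ~ M_P^%s r_S^%s" % (planck_power_in_c, rs_power_in_c))
```

The final check confirms that the estimate has dimension $M^{-3}$, the same as $c_{E,B}$ itself.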


Exercise 


1 Work out the “electric” and “magnetic” components of the gravitational field we feel every day. 


Note 


1. This treatment is based on P. Goldberger and I. Rothstein, arXiv: 0409156. I neglect various technicalities, 
such as field redefinition. See QFT Nut for more details. 


X.5 Topological Field Theory


Not having clocks and rulers means that you are topological 


To do physics, we need clocks and rulers. 

By specifying the separation between events in spacetime, the metric in effect provides 
us with clocks and rulers. Indeed, in the action, we have to use the metric to contract 
spacetime indices, and even if there were no indices* to contract, the spacetime volume 
$d^4x\sqrt{-g}$ knows about the metric. It would appear that the metric is indispensable for
writing down the action. 

But is that necessarily so? Dear reader, please pause and think. 

Recall the antisymmetric or Levi-Civita symbol $\epsilon^{\mu\nu\cdots\rho\sigma}$ first introduced in chapter I.4,
which we have since met repeatedly, for example in chapter IV.2. Recall also that in
d-dimensional spacetime, $\epsilon^{\mu\nu\cdots\rho\sigma}$ carries d indices with $\epsilon^{012\cdots(d-1)} = 1$, and the rest
determined by antisymmetry. For example, for d = 4, $\epsilon^{2031} = -\epsilon^{2013} = +\epsilon^{0213} = -\epsilon^{0123} = -1$.
So, besides the metric, we can also use the antisymmetric symbol to contract indices.
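
For readers who like to check signs by machine, here is a small sketch of the antisymmetric symbol with the normalization $\epsilon^{012\cdots(d-1)} = 1$. The function name `levi_civita` is my own; the sign is found by sorting the indices with transpositions, each of which flips the sign.

```python
def levi_civita(*indices: int) -> int:
    """Antisymmetric symbol with epsilon^{0 1 ... d-1} = +1 (d = number of indices)."""
    d = len(indices)
    if sorted(indices) != list(range(d)):
        return 0  # a repeated or out-of-range index gives zero
    perm = list(indices)
    sign = 1
    for i in range(d):
        # Swap the entry at position i into its home slot until position i is correct.
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign  # each transposition flips the sign
    return sign

# The d = 4 chain quoted in the text.
chain = [levi_civita(2, 0, 3, 1), -levi_civita(2, 0, 1, 3),
         levi_civita(0, 2, 1, 3), -levi_civita(0, 1, 2, 3)]
print(chain)
```

All four entries of `chain` agree, as the antisymmetry argument requires.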

Offered the antisymmetric symbol, we could contract it with a bunch of vectors or tensors
to form an object with no free uncontracted indices, for instance $T = \epsilon^{\mu\nu\lambda\cdots\sigma}A_\mu B_\nu C_\lambda \cdots
Z_\sigma$. To see clearly what is going on, we specialize to d = 2 and study $T = \epsilon^{\mu\nu}A_\mu B_\nu =
A_0B_1 - A_1B_0$.

How does $T$ transform? Our friend Confusio might have naively thought that since this
object does not carry any indices, it transforms like a scalar. But that’s not so: while it looks
like a duck, it does not quack like a duck. Here we need the definition of the determinant: for
a matrix $M$, $\epsilon^{\rho\sigma}M^\mu{}_\rho M^\nu{}_\sigma = (\det M)\epsilon^{\mu\nu}$. You can easily verify that this definition coincides
with the high school definition. Set $\mu = 0$, $\nu = 1$, for example: $\epsilon^{\rho\sigma}M^0{}_\rho M^1{}_\sigma = M^0{}_0M^1{}_1 -
M^0{}_1M^1{}_0 = \det M$, so that $\det M$ is indeed the determinant of the matrix $M$.


* Such as in the cosmological constant term in chapter VI.2. 
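Now we can work out how $T$ transforms. Regarding $\partial x^\rho/\partial x'^\mu$ as the matrix $M^\rho{}_\mu$, we
obtain

$T'(x') = \epsilon^{\mu\nu}A'_\mu(x')B'_\nu(x') = \epsilon^{\mu\nu}\frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}A_\rho(x)B_\sigma(x) = \det\left(\frac{\partial x}{\partial x'}\right)\epsilon^{\rho\sigma}A_\rho(x)B_\sigma(x)$

$= \det\left(\frac{\partial x}{\partial x'}\right)T(x)$ (1)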


We learned that, in spite of T carrying no indices, it does not transform as a scalar. 
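
A quick numerical check of this transformation law in d = 2 is sketched below. The matrix `M` stands in for $\partial x^\rho/\partial x'^\mu$ at one point, and its entries, like the components of `A` and `B`, are random numbers chosen purely for illustration.

```python
import random

def epsilon2(mu: int, nu: int) -> int:
    """Two-index antisymmetric symbol with epsilon^{01} = +1."""
    return nu - mu if {mu, nu} == {0, 1} else 0

def contract_T(A, B):
    """T = epsilon^{mu nu} A_mu B_nu = A_0 B_1 - A_1 B_0."""
    return sum(epsilon2(m, n) * A[m] * B[n] for m in range(2) for n in range(2))

def transform_covector(M, A):
    """A'_mu = M[rho][mu] * A_rho, with M[rho][mu] playing the role of dx^rho/dx'^mu."""
    return [sum(M[rho][mu] * A[rho] for rho in range(2)) for mu in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

random.seed(0)  # fixed seed so the illustration is reproducible
M = [[random.uniform(-2, 2) for _ in range(2)] for _ in range(2)]
A = [random.uniform(-2, 2) for _ in range(2)]
B = [random.uniform(-2, 2) for _ in range(2)]

T_old = contract_T(A, B)
T_new = contract_T(transform_covector(M, A), transform_covector(M, B))
print(T_new, det2(M) * T_old)  # the two numbers agree up to rounding
```

The index-free object picks up exactly one factor of the determinant, which is why it fails to be a scalar.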

How to deal with that pesky determinant in (1)? As we had already noted when we
discussed area and volume in chapter I.5, taking the determinant of both sides of the
equation $g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}$, telling us how the metric transforms, we obtain $g'(x') =
g(x)\left(\det\left(\frac{\partial x}{\partial x'}\right)\right)^2$, where, as always, $g(x) = \det g_{\mu\nu}(x)$ is also somebody who fails to be a
scalar.

In a manner reminiscent of our discussion in chapter V.6, we can now form the
combination* $T(x)/\sqrt{g(x)}$, which does transform like a scalar, since $T'(x')/\sqrt{g'(x')} =
T(x)/\sqrt{g(x)}$.

Convince yourself that while this discussion was carried out for d = 2 for the sake of
pedagogical clarity, our conclusion holds for any d. (You can write down more Greek letters,
can’t you?) Specifically, if we have available in a d-dimensional theory an antisymmetric
tensor $T_{\mu\nu\cdots\sigma}(x)$, then we can form a scalar† $\epsilon^{\mu\nu\cdots\sigma}T_{\mu\nu\cdots\sigma}(x)/\sqrt{g(x)}$. We are thus free
to add to our action the term


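$S_{\rm topological} = \int d^dx\,\sqrt{g(x)}\,\epsilon^{\mu\nu\cdots\sigma}T_{\mu\nu\cdots\sigma}(x)/\sqrt{g(x)} = \int d^dx\,\epsilon^{\mu\nu\cdots\sigma}T_{\mu\nu\cdots\sigma}(x)$ (2)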


which is invariant under general coordinate transformations. 

The point of this discussion is that, remarkably, the volume factor of $\sqrt{g}$ associated with
$d^dx$ has disappeared. Indeed, $S_{\rm topological}$ does not know anything about the metric $g_{\mu\nu}$, and
for that matter, it does not even know about the flat Minkowski metric $\eta_{\mu\nu}$. In other words,
it does not know about clocks and rulers!

We can stretch and deform spacetime without $S_{\rm topological}$ noticing anything different: that
guy is topological! The physics it describes is sensitive only to the topology of spacetime, 
not to the metric and the curvature. 


* In this chapter, I find it convenient to write, on occasion, $\sqrt{g}$ instead of $\sqrt{-g}$ to minimize clutter.
† Another way of stating this result is that while $\epsilon^{\mu\nu\cdots\sigma}$ does not transform as a tensor, $\hat\epsilon^{\mu\nu\cdots\sigma}(x) =
\epsilon^{\mu\nu\cdots\sigma}/\sqrt{g(x)}$ does.


Topological terms in gauge theories


Let us illustrate how this works with a theory we know and love, namely 4-dimensional
electromagnetism. Indeed, when the professor in a course on electromagnetism showed
you the Maxwell action (V.6.18) $S_{\rm Maxwell} = -\frac{1}{4}\int d^4x\,\sqrt{-g}\,g^{\mu\rho}g^{\nu\sigma}F_{\mu\nu}F_{\rho\sigma}$ (perhaps written
only for the Minkowskian case $g_{\mu\nu} = \eta_{\mu\nu}$), you could have raised your hand and asked
about adding the term

$\frac{1}{4}\int d^4x\,\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma} = \int d^4x\,\epsilon^{\mu\nu\rho\sigma}(\partial_\mu A_\nu)(\partial_\rho A_\sigma) = \int d^4x\,\partial_\mu(\epsilon^{\mu\nu\rho\sigma}A_\nu\partial_\rho A_\sigma)$


a term mentioned only in some of the better texts on electromagnetism. Over the past 
few decades, particle and condensed matter theorists have come to appreciate the role 
played by this term (an example of the term described in general in (2)) and its various 
generalizations. 

Well, a better professor of electromagnetism might have pointed out that the integrand
in this extra term is equal to $\partial_\mu(\epsilon^{\mu\nu\rho\sigma}A_\nu\partial_\rho A_\sigma)$ and is therefore a total divergence. The extra
term you clamored for only depends on $A_\mu$ at spacetime infinity. Since the action principle
involves local variations, this extra term does not contribute to Maxwell’s equations of
motion, and for this reason is normally not mentioned in standard texts on electromagnetism.
(This action has a number of other interesting features, but I do not wish to pursue them
here. I might mention only that it is not invariant under time reversal* $t \to -t$ and space
reflection† $\vec x \to -\vec x$.)
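
For completeness, the two steps behind the total-divergence statement can be written out as follows. This is a sketch that uses only $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and the antisymmetry of $\epsilon^{\mu\nu\rho\sigma}$.

```latex
\begin{align*}
\epsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}
  &= \epsilon^{\mu\nu\rho\sigma}
     (\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial_\rho A_\sigma - \partial_\sigma A_\rho)
   = 4\,\epsilon^{\mu\nu\rho\sigma}(\partial_\mu A_\nu)(\partial_\rho A_\sigma), \\
\partial_\mu\!\left(\epsilon^{\mu\nu\rho\sigma}A_\nu\,\partial_\rho A_\sigma\right)
  &= \epsilon^{\mu\nu\rho\sigma}(\partial_\mu A_\nu)(\partial_\rho A_\sigma)
   + \epsilon^{\mu\nu\rho\sigma}A_\nu\,\partial_\mu\partial_\rho A_\sigma
   = \epsilon^{\mu\nu\rho\sigma}(\partial_\mu A_\nu)(\partial_\rho A_\sigma).
\end{align*}
```

The term with two derivatives on $A_\sigma$ drops out because $\partial_\mu\partial_\rho$ is symmetric in $\mu\rho$ while $\epsilon^{\mu\nu\rho\sigma}$ is antisymmetric. The factor of 4 in the first line is what the prefactor $\frac{1}{4}$ cancels.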


The Chern-Simons term in (2+1)-dimensional spacetime 


As I emphasized, the discussion so far applies for any d. Suppose we are in (2 + 1)-
dimensional, rather than (3 + 1)-dimensional, spacetime, and suppose that physics is
governed by the analog of the Maxwell action $S_{\rm Maxwell} = -\frac{1}{4}\int d^3x\,\sqrt{-g}\,g^{\mu\rho}g^{\nu\sigma}f_{\mu\nu}f_{\rho\sigma}$,
where $f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$. I write $a_\mu$ rather than $A_\mu$ here to emphasize that I am not
talking about the electromagnetic potential but rather some gauge potential that describes
the degree of freedom in some (2 + 1)-dimensional physical situation. In appendix 1, I will
tell you that there are (2 + 1)-dimensional condensed matter systems that can be described
by a gauge potential $a_\mu$, but for the moment, our discussion is purely theoretical.

As explained, we can now add the topological term (known as the Chern-Simons term¹)

$S_{CS} = \frac{k}{4\pi}\int d^3x\,\epsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda$ (3)

to the Maxwell action. By the way, using the differential forms introduced in chapter IX.7,
we can write this compactly as

$S_{CS} = \frac{k}{4\pi}\int a\,da$ (4)

using the identity $dx^\mu dx^\nu dx^\lambda = \epsilon^{\mu\nu\lambda}d^3x$.


* I have already mentioned time reversal on several occasions, including chapters III.1 and VIII.1.
† In odd-dimensional space, space reflection (also known as parity) is equivalent to reflection along a particular
axis (say, $x^1 \to -x^1$, $x^i \to +x^i$, $i = 2, \cdots, D-1$, $D$ odd) followed by a rotation. To see this, note that the
determinants of the transformations involved are variously ±1.
Long distance dominance


Before we investigate this truly amazing state of affairs further, I need to bring up another 
important point, namely how various terms behave at long distances. As explained in 
chapter X.3, at long distances, terms with higher length dimensions (that is, terms with 
lower mass dimensions, to use the language favored in particle physics) dominate terms 
with lower length dimensions (that is, terms with higher mass dimensions). Indeed, let’s 
review the argument placed in the present context. Consider a system whose effective field
theoretic description² is given by

$S = \frac{k}{4\pi}\int d^3x\,\left(\epsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda - l\sqrt{-g}\,g^{\mu\rho}g^{\nu\sigma}f_{\mu\nu}f_{\rho\sigma} + \cdots\right)$ (5)

namely the sum of a Chern-Simons term, a Maxwell term, and so on.

Since the Maxwell term has two powers of derivatives while the Chern-Simons term
has only one (and they both have two powers of the gauge potential $a_\mu$), high school
dimensional analysis demands the introduction of a length $l$ characteristic of the system
we are studying. When we study physical phenomena on a distance scale of $L$, the effect
of the Maxwell term is thus suppressed by a factor of $l/L$ relative to the Chern-Simons
term for $L \gg l$. The ellipsis in (5) indicates terms of even lower length dimensions. They
are thus multiplied by even higher powers³ of $l$ and are even more strongly suppressed at
long distances.

Suppose we have some 2-dimensional solid state structure with complicated microscopic
physics but such that its long distance degree of freedom is described by a gauge
potential $a_\mu(x)$ whose dynamics is gauge invariant (that is, invariant under the
transformation $a_\mu \to a_\mu + \partial_\mu\Lambda$). Actual solid state structures are of course not Lorentz invariant.
Thus, the Maxwell term in (5) should be replaced by something like $(f_{0i})^2 - \beta(f_{ij})^2$, with
$\beta$ a coefficient determined by the microscopic dynamics. You might expect that the
Chern-Simons term would similarly break up into $\epsilon^{ij}(a_i\partial_0 a_j + \gamma a_0\partial_i a_j)$. But remarkably, as you
can readily verify, gauge invariance fixes the coefficient $\gamma$ to have precisely the value that
allows the 2 terms to combine into $\epsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda$. For the Chern-Simons term—but not for
the Maxwell term—gauge invariance implies Lorentz invariance.
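
The verification can be sketched in a few lines. Two assumptions are mine rather than the text's: boundary terms are dropped (the symbol $\simeq$ means "equal up to total derivatives"), and the spatial symbol is normalized as $\epsilon^{0ij} = \epsilon^{ij}$ with $\epsilon^{12} = 1$.

```latex
\begin{align*}
\mathcal{L}_\gamma &= \epsilon^{ij}\left(a_i\,\partial_0 a_j + \gamma\,a_0\,\partial_i a_j\right),
  \qquad \delta a_\mu = \partial_\mu\Lambda, \\
\delta\mathcal{L}_\gamma
  &= \epsilon^{ij}\left(\partial_i\Lambda\,\partial_0 a_j + a_i\,\partial_0\partial_j\Lambda
     + \gamma\,\partial_0\Lambda\,\partial_i a_j
     + \gamma\,a_0\,\partial_i\partial_j\Lambda\right) \\
  &\simeq -(2+\gamma)\,\Lambda\,\epsilon^{ij}\,\partial_0\partial_i a_j
  \quad\Longrightarrow\quad \gamma = -2, \\
\epsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda
  &= \epsilon^{ij}\left(a_0\,\partial_i a_j - a_i\,\partial_0 a_j + a_i\,\partial_j a_0\right)
   \simeq -\,\epsilon^{ij}\left(a_i\,\partial_0 a_j - 2\,a_0\,\partial_i a_j\right)
   = -\,\mathcal{L}_{\gamma=-2}.
\end{align*}
```

In the second line the last term vanishes identically by antisymmetry, and each of the other three is integrated by parts onto $a_j$. The value $\gamma = -2$ demanded by gauge invariance is thus the same value that appears when the Lorentz invariant combination is split into time and space components.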

We conclude that, amazingly, the long distance physics of such a system, if it exists, is 
topological and does not depend on the nasty microscopic physics (such as band structure 
and the effect of impurities) that our solid state colleagues revel in. The physics is universal 
and determined completely by the parameter k. More on this in appendix 1. 

Note that in (3 + 1)-dimensional spacetime, the added topological term, as described in
the preceding section, has the same mass dimensions as the Maxwell term $f^2$, and hence
it does not dominate at long distances. In (4 + 1)-dimensional spacetime, the topological term
$\epsilon^{\mu\nu\lambda\rho\sigma}a_\mu f_{\nu\lambda}f_{\rho\sigma}$ is less important at long distances than the Maxwell term $\sim f^2$.

It is also worth remarking that with differential forms, we can write the topological
term compactly as $(da)^n$ in d = 2n-dimensional spacetime, and as $a(da)^n$ in d = (2n + 1)-
dimensional spacetime.
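
The comparison across dimensions can be reduced to counting fields and derivatives. The sketch below assumes one particular normalization, namely that the Maxwell term $f^2$ has mass dimension exactly d, which fixes the dimension of $a_\mu$. Other normalizations shift the bookkeeping but, as far as I can see, not the ordering of the terms. The function names are hypothetical.

```python
from fractions import Fraction

def gauge_field_dimension(d: int) -> Fraction:
    """Mass dimension of a_mu, ASSUMING the Maxwell term f^2 has dimension d.

    f ~ (derivative) * a, so 2 * ([a] + 1) = d.
    """
    return Fraction(d, 2) - 1

def term_dimension(d: int, n_fields: int, n_derivatives: int) -> Fraction:
    """Mass dimension of a term with the given number of a's and derivatives."""
    return n_fields * gauge_field_dimension(d) + n_derivatives

def compare_to_maxwell(d: int, n_fields: int, n_derivatives: int) -> str:
    """Lower mass dimension than the Maxwell term means dominance at long distances."""
    diff = term_dimension(d, n_fields, n_derivatives) - d
    if diff < 0:
        return "dominates at long distances"
    if diff == 0:
        return "comparable to Maxwell"
    return "suppressed at long distances"

# a(da)^n in d = 2n + 1 has n + 1 fields and n derivatives;
# (da)^n in d = 2n has n fields and n derivatives.
print("d=3, a da   :", compare_to_maxwell(3, n_fields=2, n_derivatives=1))
print("d=4, (da)^2 :", compare_to_maxwell(4, n_fields=2, n_derivatives=2))
print("d=5, a(da)^2:", compare_to_maxwell(5, n_fields=3, n_derivatives=2))
```

Under this assumption the three cases come out as in the text: dominant in (2 + 1) dimensions, on a par with Maxwell in (3 + 1), and subdominant in (4 + 1).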
Appendix 1: Quantum Hall fluid and ground state degeneracy


Is topological field theory merely a theoretical possibility, a curiosity for theorists, or can it be realized physically? 
In fact, the long distance effective theory of the quantum Hall fluid is topological. Unquestionably, a detailed 
discussion of the theory of the quantum Hall fluid lies beyond the scope⁴ of a textbook on Einstein gravity.
Here I limit myself to saying that the long distance physics for a system of electrons confined to 2-dimensional
structures in the presence of a strong magnetic field turns out to be given by the Chern-Simons action $S_{CS}$ (3).

A topological field theory must feel peculiarly out of place in a book on gravity and curved spacetime! It 
doesn’t know about the metric, a concept central to Riemannian geometry and Einstein gravity. We learned early 
on that the energy momentum tensor is defined by varying the action with respect to $g_{\mu\nu}$. What if the action
does not depend on $g_{\mu\nu}$? Inescapably, in a topological field theory, the energy momentum tensor and hence the
Hamiltonian is identically zero! As I already said, to determine the Hamiltonian we need clocks and rulers.

What does it mean for a quantum system to have a Hamiltonian H = 0? Well, when we took a course on 
quantum mechanics, if the professor assigned an exam problem to find the spectrum of the Hamiltonian 0, we 
could do it easily! All states have energy E = 0. We are ready to hand in the solution. 

But the nontrivial problem is to determine how many states there are. This number, known as the ground 
state degeneracy, depends only on the topology of the manifold, not on whatever metric we might put on the 
manifold. It is beyond the scope of a book on gravity to calculate this quantum degeneracy, but it turns out⁵ to
be equal to $k^g$, where $g$ here denotes the genus of the manifold. (Recall that $g = 0$ for the sphere, $g = 1$ for the
torus,* and so on.)

Note that this result implies that $k$ has to be an integer. Otherwise, it would be senseless to say that there are
$k^g$ states with $E = 0$. This fascinating phenomenon is known as topological quantization.


Appendix 2: The Hodge star operation on differential forms 


Here I discuss the Hodge star operation $*$ on differential forms. While this topic properly belongs in chapter IX.7,
I had to postpone it until we have discussed the antisymmetric symbol in curved spacetime. Again, for ease of 
writing and clarity of presentation, I will specialize temporarily to d = 2. The diligent reader can readily generalize 
the discussion to arbitrary d as we go along. 

As in the text, we adopt the convention that $\epsilon^{01} = 1$. You might have noticed that we did not ever have to
introduce the totally antisymmetric symbol $\epsilon_{\mu\nu}$ with lower indices, but here it comes finally. We define it by
specifying $\epsilon_{01} = -1$. (That $\epsilon^{01}$ and $\epsilon_{01}$ have opposite signs is to avoid an overall minus sign in the definition of
the star operation given below.) With this convention, we have


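$\epsilon^{\mu\nu}\epsilon_{\rho\sigma} = (-\delta^\mu_\rho\delta^\nu_\sigma + \delta^\mu_\sigma\delta^\nu_\rho)$ and $\epsilon^{\mu\nu}\epsilon_{\nu\rho} = +\delta^\mu_\rho$ (6)

Another identity comes from the definition of the determinant

$g_{\mu\lambda}g_{\nu\rho}\epsilon^{\lambda\rho} = -g\,\epsilon_{\mu\nu}$ and $g^{\mu\lambda}g^{\nu\rho}\epsilon_{\lambda\rho} = -\frac{1}{g}\epsilon^{\mu\nu}$ (7)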


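as was already mentioned in the text. You can verify (6) and (7) by evaluating them for various values of $\mu$
and $\nu$. In particular, in flat spacetime, $\epsilon_{\mu\nu} = \eta_{\mu\lambda}\eta_{\nu\rho}\epsilon^{\lambda\rho}$. Multiplying the first identity in (7) by $g^{\nu\sigma}$, we obtain
$g_{\mu\lambda}\epsilon^{\lambda\sigma} = -g\,\epsilon_{\mu\nu}g^{\nu\sigma}$. Multiplying this by $\epsilon_{\sigma\rho}$ then yields

$\epsilon_{\mu\nu}g^{\nu\sigma}\epsilon_{\sigma\rho} = -\frac{1}{g}g_{\mu\rho}$ (8)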


(Some readers may recognize this as Cramer’s rule for finding the inverse of a matrix in this context.) 
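
Since the text invites you to verify these identities by evaluating components, here is a sketch that does so numerically in d = 2, with the conventions $\epsilon^{01} = 1$ and $\epsilon_{01} = -1$. The metric entries are arbitrary numbers I picked for illustration; any invertible symmetric matrix should do.

```python
EPS_UP = [[0, 1], [-1, 0]]    # epsilon^{mu nu}, with epsilon^{01} = +1
EPS_DOWN = [[0, -1], [1, 0]]  # epsilon_{mu nu}, with epsilon_{01} = -1
R = range(2)

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse_2x2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def close(a, b, tol=1e-12):
    return abs(a - b) < tol

g_lower = [[-1.3, 0.4], [0.4, 2.0]]  # arbitrary symmetric metric (illustrative)
det_g = det2(g_lower)
g_upper = inverse_2x2(g_lower)

# epsilon^{mu nu} epsilon_{nu rho} = +delta^mu_rho
contraction_ok = all(
    close(sum(EPS_UP[m][n] * EPS_DOWN[n][r] for n in R), 1.0 if m == r else 0.0)
    for m in R for r in R)

# g_{mu l} g_{nu r} epsilon^{l r} = -g epsilon_{mu nu}
lowering_ok = all(
    close(sum(g_lower[m][l] * g_lower[n][r] * EPS_UP[l][r] for l in R for r in R),
          -det_g * EPS_DOWN[m][n])
    for m in R for n in R)

# g^{mu l} g^{nu r} epsilon_{l r} = -(1/g) epsilon^{mu nu}
raising_ok = all(
    close(sum(g_upper[m][l] * g_upper[n][r] * EPS_DOWN[l][r] for l in R for r in R),
          -EPS_UP[m][n] / det_g)
    for m in R for n in R)

# epsilon_{mu nu} g^{nu s} epsilon_{s rho} = -(1/g) g_{mu rho}
cramer_ok = all(
    close(sum(EPS_DOWN[m][n] * g_upper[n][s] * EPS_DOWN[s][r] for n in R for s in R),
          -g_lower[m][r] / det_g)
    for m in R for r in R)

print(contraction_ok, lowering_ok, raising_ok, cramer_ok)
```

The last check is the Cramer's rule statement in disguise: sandwiching the inverse metric between two $\epsilon$'s returns the metric itself divided by its determinant, up to sign.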
I use the convention in which $\epsilon^{\mu\nu}$ and $\epsilon_{\mu\nu}$ are numerical, that is, with components given variously by ±1, 0.
The price we pay is that when we raise and lower indices, factors of $g$ will appear as in (6) and (7). We can
define, alternatively, $\hat\epsilon^{\mu\nu} = \epsilon^{\mu\nu}/\sqrt{-g}$ and $\hat\epsilon_{\mu\nu} = \epsilon_{\mu\nu}\sqrt{-g}$, which behave like tensors, as was already mentioned
in a footnote. As you can easily imagine, there are advantages and disadvantages to both conventions.

* Incidentally, it is not as far-fetched as it might seem that theoretical physicists would consider systems living
on a torus. If you study a quantum system in a rectangular domain and impose periodic boundary conditions
$\psi(x, y) = \psi(x + L, y) = \psi(x, y + L)$ on the wave function, you are effectively putting the system on a torus,
namely a square with opposite sides identified.


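Given a 1-form $V = V_\mu dx^\mu$, define

$*V = (*V)_\mu dx^\mu = (\sqrt{-g}\,\epsilon_{\mu\nu}V^\nu)\,dx^\mu$ (9)

with $V^\nu = g^{\nu\lambda}V_\lambda$, as usual.
Let us compute

$d*V = \partial_\nu(\sqrt{-g}\,\epsilon_{\mu\lambda}V^\lambda)\,dx^\nu dx^\mu = \partial_\nu(\sqrt{-g}\,V^\nu)\,d^2x = \left(\frac{1}{\sqrt{-g}}\partial_\nu(\sqrt{-g}\,V^\nu)\right)\sqrt{-g}\,d^2x$ (10)

where in the second equality, we used

$dx^\mu dx^\nu = \epsilon^{\mu\nu}d^2x$ (11)

and (6). We see that the operation $d*$ acting on a 1-form gives us a 0-form proportional to the covariant divergence
$D_\mu V^\mu = \frac{1}{\sqrt{-g}}\partial_\mu(\sqrt{-g}\,g^{\mu\nu}V_\nu)$, and so it is clearly going to be useful for physics.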

Proceed now to d-dimensional spacetime. Given a p-form $V$ (with $p \le d$), define the (d − p)-form $*V$ by
generalizing (9)

$*V = (\sqrt{-g}\,\epsilon_{\mu_1\cdots\mu_{d-p}\nu_1\cdots\nu_p}V^{\nu_1\cdots\nu_p})\,dx^{\mu_1}\cdots dx^{\mu_{d-p}}$ (12)

Here we use $\epsilon_{\mu_1\cdots\mu_d}$ instead of $\epsilon_{\mu\nu}$ (of course!). (Some authors define the $*$ operation with an overall factor
$(p!(d-p)!)^{-1}$ in (12). We will not get all uptight about these factors. You can fill them in if you so desire.)

For $V$ a d-form, we obtain the 0-form $*V = (\sqrt{-g}\,\epsilon_{\mu_1\cdots\mu_d}V^{\mu_1\cdots\mu_d})$. Consider the 0-form denoted by 1. Note
the d-form $*1 = \sqrt{-g}\,\epsilon_{\mu_1\cdots\mu_d}dx^{\mu_1}\cdots dx^{\mu_d}$. We readily check that, up to a factor, $*$ takes 1 back to itself: $**1 =
(\sqrt{-g})^2\epsilon_{\mu_1\cdots\mu_d}g^{\mu_1\nu_1}\cdots g^{\mu_d\nu_d}\epsilon_{\nu_1\cdots\nu_d} = ((-g)/(-g))\,\epsilon_{\mu_1\cdots\mu_d}\epsilon^{\mu_1\cdots\mu_d} = d!$. Indeed, you can check that $**$ takes any
p-form back to itself. I will verify a simple case: act with $*$ on (9) to obtain $**V = \sqrt{-g}\,\epsilon_{\nu\mu}g^{\mu\rho}(\sqrt{-g}\,\epsilon_{\rho\lambda}V^\lambda)dx^\nu =
V_\nu dx^\nu = V$, as claimed (we used (8) in the next to last equality).
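
The simple case $**V = V$ can also be checked numerically. The sketch below implements the d = 2 star operation on a 1-form exactly as in (9), $(*V)_\mu = \sqrt{-g}\,\epsilon_{\mu\nu}g^{\nu\lambda}V_\lambda$ with $\epsilon_{01} = -1$, and applies it twice. The metric (chosen with negative determinant so that $\sqrt{-g}$ is real) and the components of `V` are illustrative assumptions.

```python
import math

EPS_DOWN = [[0, -1], [1, 0]]  # epsilon_{mu nu}, with epsilon_{01} = -1

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse_2x2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def hodge_star_1form(V_lower, g_lower):
    """Components (*V)_mu = sqrt(-g) epsilon_{mu nu} V^nu of the star of a 1-form in d = 2."""
    g_upper = inverse_2x2(g_lower)
    # Raise the index: V^nu = g^{nu lambda} V_lambda
    V_upper = [sum(g_upper[n][l] * V_lower[l] for l in range(2)) for n in range(2)]
    root = math.sqrt(-det2(g_lower))  # requires det g < 0 (Lorentzian signature)
    return [root * sum(EPS_DOWN[m][n] * V_upper[n] for n in range(2)) for m in range(2)]

g_lower = [[-1.3, 0.4], [0.4, 2.0]]  # illustrative metric with det g < 0
V = [0.7, -1.9]                      # illustrative 1-form components

star_V = hodge_star_1form(V, g_lower)
star_star_V = hodge_star_1form(star_V, g_lower)
print(V, star_star_V)  # the two lists agree up to rounding
```

Applying the star once gives a genuinely different 1-form. Applying it twice returns the original components, with the two factors of $\sqrt{-g}$ cancelling the $1/g$ from identity (8).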

Back in chapter IX.7, you learned that the Bianchi identity $dF = ddA = 0$ corresponds to half of Maxwell’s
equations. You might wonder, in the language of forms, where the other half is, the half sourced by the current
$J_\mu$, in contrast to the half corresponding to the Bianchi identity. Here is the answer:

$d*F = *J$ (13)

where $F = dA$ denotes the electromagnetic field strength 2-form, and $J = J_\mu dx^\mu$ the current 1-form. First, check
that the two sides match. The right hand side is a (d − 1)-form. As for the left hand side, $*F$ is a (d − 2)-form,
and so $d*F$ is also a (d − 2) + 1 = (d − 1)-form.

The left hand side of our proposed equation contains one power of derivative $\partial_\mu$ and one power of $F_{\mu\nu}$, so it
has got to be the divergence of the field strength, with possibly some factors of $\sqrt{-g}$ (coming from the definition
of $*$) thrown in. Between us friends, it is hardly necessary to verify this claim, but let’s do it anyway, for arbitrary
d. However, do be a mature adult and not worry about the factorials and signs that are irrelevant for our purposes
here. We have

$d*F = d\left(\sqrt{-g}\,\epsilon_{\mu_1\cdots\mu_{d-2}\mu\nu}F^{\mu\nu}\,dx^{\mu_1}\cdots dx^{\mu_{d-2}}\right)$
$= \partial_\sigma\left(\sqrt{-g}\,\epsilon_{\mu_1\cdots\mu_{d-2}\mu\nu}F^{\mu\nu}\right)dx^\sigma dx^{\mu_1}\cdots dx^{\mu_{d-2}}$ (14)

while

$*J = \left(\sqrt{-g}\,\epsilon_{\mu_1\cdots\mu_{d-1}\mu}J^\mu\right)dx^{\mu_1}\cdots dx^{\mu_{d-2}}dx^{\mu_{d-1}}$ (15)

Multiplying (14) and (15) by $dx^\nu$ and using the generalization of (11) and (6) to d dimensions, we obtain (without
worrying about signs and such) $\partial_\mu(\sqrt{-g}\,F^{\mu\nu}) = \sqrt{-g}\,J^\nu$, as expected. Note that Maxwell’s equations work in any
spacetime dimension.

At this point, you might wonder how to write Maxwell’s action using forms. The answer is $S = \int F*F$.
(Check this!) Note that since $*F$ is a (d − 2)-form, $F*F$ is a d-form, just ripe for integrating over d-dimensional
spacetime. It is easy to add a current: $S = \int F*F + A*J$. The equation of motion we laboriously derived in (13)
follows by writing $F = dA$ and varying $S$ with respect to $A$ formally. Again, without worrying about irrelevant
factors, we obtain immediately $\delta S = \int (d*F + *J)\delta A = 0$ and hence Maxwell’s equations.
Figure 1 (a) Gluing 2 tetrahedra together. (b) A spherical blob grows a trunk.


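What about the Einstein-Hilbert action? Let’s try $S = \int R_{\alpha\beta}*(e^\alpha e^\beta)$. Again, the answer is easy to guess. We
need a d-form to integrate over, involving 1 power of the curvature 2-form and no power of derivative. To obtain
a d-form, we multiply the curvature 2-form by a (d − 2)-form: the only possibility in this context is the star of the
2-form $e^\alpha e^\beta$. Again, between us friends, do you doubt this? Okay, let’s check (but, to save writing and to spare
you the stream of indices, only for 4-dimensional spacetime):

$R_{\alpha\beta}*(e^\alpha e^\beta) = (R_{\alpha\beta\mu\nu}dx^\mu dx^\nu)(\epsilon_{\rho\sigma\lambda\tau}\sqrt{-g}\,e^{\alpha\lambda}e^{\beta\tau}dx^\rho dx^\sigma)$
$= R^{\lambda\tau}{}_{\mu\nu}\,\epsilon_{\rho\sigma\lambda\tau}\sqrt{-g}\,dx^\mu dx^\nu dx^\rho dx^\sigma = R^{\lambda\tau}{}_{\mu\nu}\,\epsilon_{\rho\sigma\lambda\tau}\sqrt{-g}\,\epsilon^{\mu\nu\rho\sigma}d^4x$
$= R_{\lambda\tau\mu\nu}\,g^{\lambda\mu}g^{\tau\nu}\sqrt{-g}\,d^4x = \sqrt{-g}\,d^4x\,R$ (16)

(using the generalization of (6) and again ignoring overall numerical factors), as expected.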


Appendix 3: Topological invariants: Euler characteristic, 
Gauss-Bonnet theorem, and all that 


We are now ready to look at some celebrated invariants in topology. Our discussion will be heuristic rather than 
rigorous, hitting some highlights rather than being exhaustive. I keep the discussion as elementary as possible. 

I suspect that many readers probably first encountered the Euler characteristic, like me, in a popular book of 
mathematics. For me, it was a real eye opener. First, let us look at the empirical data. The cube has 8 vertices or 
corners, 12 edges (4 on the top, 4 on the bottom, and 4 on the side), and 6 faces. Hence, V = 8, E = 12, F =6. 
Next, the tetrahedron has 4 vertices, 6 edges (3 on the bottom and 3 on the side), and 4 faces, that is, V = 4, 
E = 6, F = 4. We see that the combination $\chi = V - E + F$, known as the Euler characteristic, is equal to 2 in
both cases. (Note that in the sum, we add the number of geometrical entities, vertices, edges, and faces, with 
an alternating sign according to whether the dimension of the entity is even or odd.) The joke is that theoretical 
physicists would proclaim this to be a theorem at this point, but in fact it is easy to prove. Let me give a proof 
that would satisfy most physicists (but not mathematicians) and is suitable for elementary school children. 

Glue another tetrahedron to the tetrahedron we have, producing a 6 faced “diamond-shaped” object. See
figure 1a. We gain 4 faces (the four of the second tetrahedron) but lose 2 faces (the two that are glued together).
Thus, $\Delta F = +4 - 2 = 2$. Similarly, we can see that $\Delta V = +4 - 3 = 1$ and $\Delta E = +6 - 3 = 3$. Hence, we have
$\Delta\chi = \Delta V - \Delta E + \Delta F = 1 - 3 + 2 = 0$. Now we can glue zillions of these tetrahedra together to approximate
any object we like. At every gluing, the Euler characteristic does not change. Hence $\chi = 2$ for any
spherical-looking blob. The discussion makes clear that we are counting the V, E, and F on the surface. (The interior of
the object, consisting of the faces we have glued together and their edges and vertices, has been “lost forever,” 
so to speak.) Throughout, we will be studying a surface or a 2-manifold, not the 3-dimensional blob enclosed by 
the surface. 
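
The counting can be handed to a computer. In the sketch below a surface is described by a list of faces, each a tuple of vertex labels in cyclic order, and vertices and edges are collected from the faces. The vertex labelings of the cube, the tetrahedron, and the glued "diamond" are my own choices for illustration.

```python
def euler_characteristic(faces):
    """Return V - E + F for a closed surface given as a list of faces.

    Each face is a tuple of vertex labels listed in cyclic order.
    """
    vertices = set()
    edges = set()
    for face in faces:
        vertices.update(face)
        # Consecutive vertices around the face (wrapping around) form its edges.
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add(frozenset((a, b)))
    return len(vertices) - len(edges) + len(faces)

TETRAHEDRON = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

CUBE = [(0, 1, 2, 3), (4, 5, 6, 7),                 # bottom and top
        (0, 1, 5, 4), (1, 2, 6, 5),                 # four sides
        (2, 3, 7, 6), (3, 0, 4, 7)]

# Two tetrahedra glued along the face (0, 1, 2): that face disappears from the
# surface, leaving three faces around each of the two apexes 3 and 4.
DIAMOND = [(0, 1, 3), (1, 2, 3), (0, 2, 3),
           (0, 1, 4), (1, 2, 4), (0, 2, 4)]

for name, faces in [("tetrahedron", TETRAHEDRON), ("cube", CUBE), ("diamond", DIAMOND)]:
    print(name, euler_characteristic(faces))
```

All three surfaces give the same value, in line with the gluing argument.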

The Euler characteristic is evidently a topological quantity. We can lengthen and shorten the edges of the 
tetrahedra we are gluing together without changing V, E, and F. 
Suppose we now go beyond deforming the spherical blob to changing its topology. We show presently that $\chi$
“measures” the topology of the object.

By gluing more tetrahedra on, we can make the spherical blob grow a trunk like that of an elephant. Let us
slowly extend the trunk back toward another part of the blob, sort of like an elephant about to scratch its back
with its trunk. The Euler characteristic $\chi$ remains equal to 2, until we glue the tetrahedron at the tip of the trunk
to a tetrahedron on the back of the elephant. See figure 1b. Now we lose 2 faces, 3 vertices, and 3 edges, and
thus $\Delta\chi = \Delta V - \Delta E + \Delta F = -3 + 3 - 2 = -2$. The resulting surface has the topology of a torus and Euler
characteristic $\chi = 2 - 2 = 0$.

Growing another trunk and attaching it somewhere else causes the Euler characteristic $\chi$ to decrease further,
with $\Delta\chi = -2$ every time we do it. We have thus derived the general result $\chi = V - E + F = 2(1 - g)$, where $g$
denotes the genus ($g = 0$ for the sphere, $g = 1$ for the torus, and so on). Some people call the genus the number
of “holes.” (The torus is said to have 1 hole, but as we will soon see, in this context, the word “genus” or “handle”
is preferable to the word “hole.”) The Euler characteristic $\chi$ is manifestly a topological invariant, independent of
the size or “shape” of the surface, and only dependent on its genus.

A trivial generalization is to punctured surfaces. (In everyday parlance, a punctured sphere is a sphere with a
hole in its surface, like a rapidly shrinking balloon with a hole in it. This is one reason why the use of the word
“hole” for genus or handle, as indulged in by some, is ill advised.) Since we can puncture a surface by removing
a triangular face from the surface (so that $\Delta V = 0$, $\Delta E = 0$, and $\Delta F = -1$ and hence $\Delta\chi = -1$), we have, more
generally,

$\chi = 2 - 2g - h$ (17)

with $h$ the number of holes or punctures in the surface.

Another way of proving (17) is to start with $\chi = 2$ for a spherical surface, and to punch any number of holes
in it, thus obtaining $\chi = 2 - h$. Deforming the surface to bring 2 holes near each other and then to glue them
together, we decrease $h$ by 2 and increase the genus by 1. Hence we have (17). This derivation also makes clear
the relative factor of 2 in the coefficients of $g$ and $h$ in $\chi$.
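
As a check on (17), the sketch below triangulates a torus as an n × n grid with opposite sides identified (the periodic-square picture) and counts cells directly. The grid construction and the function names are my own illustrative choices. A puncture is modeled, as in the text, by removing one triangular face while keeping its edges and vertices.

```python
def count_cells(faces):
    """Return (V, E, F) for a surface given as a list of faces (tuples of vertex labels)."""
    vertices, edges = set(), set()
    for face in faces:
        vertices.update(face)
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add(frozenset((a, b)))
    return len(vertices), len(edges), len(faces)

def torus_triangulation(n):
    """Triangles on an n x n grid with opposite sides identified (use n >= 3)."""
    def vertex(i, j):
        return (i % n) * n + (j % n)  # periodic labels implement the identification
    faces = []
    for i in range(n):
        for j in range(n):
            a, b = vertex(i, j), vertex(i + 1, j)
            c, d = vertex(i + 1, j + 1), vertex(i, j + 1)
            faces += [(a, b, c), (a, c, d)]  # split each grid square into 2 triangles
    return faces

def chi_formula(genus, punctures):
    """Equation (17): chi = 2 - 2g - h."""
    return 2 - 2 * genus - punctures

V, E, F = count_cells(torus_triangulation(4))
chi_torus = V - E + F
chi_punctured_torus = V - E + (F - 1)  # remove one face; its edges and vertices stay

print((V, E, F), chi_torus, chi_punctured_torus)
```

The direct count agrees with `chi_formula(1, 0)` for the torus and with `chi_formula(1, 1)` once a single face has been removed.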

We obtain what is known as a triangulation of the surface. (Indeed, that is what surveyors do: they triangulate 
the surface of the earth.) At the level of rigor of physics, any surface can be approximated to arbitrary accuracy 
by making the triangles small enough. In other words, we physicists would take the continuum limit without 
further ado. 

At the same level of rigor, we can also approximate spacetime by a large number of discrete elements. This 
represents the first step in a program to discretize Einstein gravity and to put it on the computer for numerical 
analysis.* 

That we used tetrahedra is not essential. Take the tetrahedron we started with. Call the vertices on the triangle 
on its “base” A, B, and C. Pick a point X on the edge joining A and B, and draw a line connecting X to the other 
vertex C. Then AV = 1, AE = 2, and AF = 1, and so Ax = 0. By “pulling” on the point X, we can deform the 
tetrahedron, if we feel like it, to a pyramid with a square base. As another example, we can pick a triangle on 
the surface we are studying and draw a line from one side of the triangle to another side, so that we divide the 
triangle into a smaller triangle and a quadrilateral. In the process, $\Delta V = 2$, $\Delta E = 3$, and $\Delta F = 1$, and so again 
$\Delta\chi = 0$. You can make up your own “moves” and show that instead of triangles, the surface could be composed 
of polygons with any number of sides you like. 
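
The bookkeeping behind these moves is simple enough to check by machine. Here is a minimal sketch in Python; the function names and the tuple layout (V, E, F) are illustrative choices of mine, not notation from the text.

```python
# Euler characteristic chi = V - E + F and the local "moves" described above.

def euler_characteristic(V, E, F):
    """Return chi = V - E + F for a surface cut into polygonal faces."""
    return V - E + F

def apply_move(counts, dV, dE, dF):
    """Apply a local move that changes (V, E, F) by (dV, dE, dF)."""
    V, E, F = counts
    return (V + dV, E + dE, F + dF)

def chi_formula(g, h):
    """Equation (17): chi = 2 - 2g - h."""
    return 2 - 2 * g - h

tetrahedron = (4, 6, 4)                          # (V, E, F)
split_edge = apply_move(tetrahedron, 1, 2, 1)    # point X on edge AB, joined to C
cut_triangle = apply_move(tetrahedron, 2, 3, 1)  # line across one triangular face
punctured = apply_move(tetrahedron, 0, 0, -1)    # remove one triangular face
```

The counts in `split_edge` are those of a pyramid with a square base (5 vertices, 8 edges, 5 faces), and the punctured tetrahedron agrees with (17) for g = 0, h = 1.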

Indeed, Descartes had already published, in his progymnasmata to the study of solids, a theorem on angular 
deficits that foreshadowed the Euler characteristic. Here is what Descartes said. At each vertex of a cube, 3 squares 
meet. The 3 angles at the vertex add up to $3(\frac{\pi}{2}) = \frac{3\pi}{2}$. The amount by which this is less than $2\pi$ is known as 
the angular deficit, in this case equal to $(2 - \frac{3}{2})\pi = \frac{\pi}{2}$. The total angular deficit, namely the sum of the angular 
deficits at each of the 8 vertices, is then equal to $8(\frac{\pi}{2}) = 4\pi$. Descartes stated that the total angular deficit of any 
polyhedron topologically equivalent to the sphere is equal to $4\pi$. 

Let’s try it for a tetrahedron. At each vertex, 3 equilateral triangles meet, with angles adding up to $3(\frac{\pi}{3}) = \pi$, 
so that the deficit equals $(2 - 1)\pi = \pi$. There are 4 vertices, and so the total deficit is indeed $4\pi$. We have verified 
2 cases, so it is surely a theorem. We are proud physicists, but still, perhaps a third example would be good. 

So consider the dodecahedron with 12 pentagonal faces. Since there are $E = (12 \times 5)/2 = 30$ edges, according 
to Euler’s theorem, the number of vertices equals $V = 2 + E - F = 2 + 30 - 12 = 20$. The 12 pentagons have 
$12 \times 5 = 60$ vertices, and hence $60/20 = 3$ pentagons meet at each vertex. For a regular polygon with $n$ sides, 
the angle $\alpha$ at each vertex is given by $n\alpha + 2\pi = n\pi$ (to see this, divide the polygon into $n$ triangles), that 


* Lattice gravity is a thriving subject of research. Historically, this first step is known as the Regge calculus. 
is, $\alpha = (n - 2)\pi/n$. In particular, for a pentagon, we have $\alpha = \frac{3}{5}\pi$. The angular deficit at each vertex is then 
$(2 - 3(\frac{3}{5}))\pi = \frac{1}{5}\pi$. Thus, the total angular deficit equals $20 \times \frac{1}{5}\pi = 4\pi$. Descartes was right! 
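
The three checks above follow one pattern, so they can be automated. The sketch below works in units of $\pi$ with exact fractions; the function name and the dictionary of solids are hypothetical helpers of mine, and the only input is the interior angle $(n - 2)\pi/n$ of a regular $n$-gon quoted in the text.

```python
from fractions import Fraction

def total_angular_deficit(n_sides, faces_per_vertex, n_vertices):
    """Total angular deficit of a regular polyhedron, in units of pi.

    Each face is a regular polygon with n_sides sides, whose interior
    angle is (n_sides - 2)/n_sides in units of pi.
    """
    interior = Fraction(n_sides - 2, n_sides)
    deficit_per_vertex = 2 - faces_per_vertex * interior
    return n_vertices * deficit_per_vertex

# (sides per face, faces meeting at a vertex, number of vertices)
solids = {
    "tetrahedron": (3, 3, 4),
    "cube": (4, 3, 8),
    "dodecahedron": (5, 3, 20),
}
deficits = {name: total_angular_deficit(*spec) for name, spec in solids.items()}
```

Every entry of `deficits` comes out as 4, that is, a total deficit of $4\pi$. Other polyhedra can be added to `solids` in the same format.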


Let’s now show that the total angular deficit is topologically invariant. Consider some triangulated surface 
as described earlier, and visualize it as a wire framework. (For ease of presentation, let the faces be triangles. 
Thus, for the cube mentioned above, simply divide the squares by their diagonals.) Pick an edge AB, shared by 
two triangles $ABC_1$ and $ABC_2$. Call the 6 angles contained in these 2 triangles $\alpha_j$, $\beta_j$, $\gamma_j$, with $j = 1, 2$, using 
an almost self-evident notation (for example, $\alpha_1$ is the angle of $ABC_1$ at the vertex $A$, $\gamma_2$ is the angle of $ABC_2$ at 
the vertex $C_2$, and so on). Now imagine lengthening the edge $AB$ slightly, thus increasing $\gamma_j$ and decreasing $\alpha_j$ 
and $\beta_j$. Thus, the change in the angular deficit at the vertex $A$ is $-(\delta\alpha_1 + \delta\alpha_2)$. But since the angles in a triangle 
have to add up to $\pi$, $\delta(\alpha_j + \beta_j + \gamma_j) = 0$. The angular deficits at the vertices $A$, $B$, $C_1$, and $C_2$ all vary, but the total 
angular deficit stays the same: it is a topological invariant, as Descartes taught us. 

Here is the previous proof dressed up to make it look more sophisticated. Label the vertices by p. At vertex 
$p$, a set $s(p)$ of triangles meet. The sum of the angles meeting at vertex $p$ is then $\sum_{i \in s(p)} \alpha_i$, with $\alpha_i$ the angle 
extended by the $i$th triangle at that vertex. The total angular deficit is then $\sum_p (2\pi - \sum_{i \in s(p)} \alpha_i)$. Let us now 
deform the surface infinitesimally by lengthening or shortening each of the zillions of edges. The variation in 
the total angular deficit is equal to $-\sum_p \sum_{i \in s(p)} \delta\alpha_i$. Rearranging to sum over 1 triangle at a time, we see that 
this vanishes, since the sum of the angles in each triangle is constrained to add up to $\pi$. 

With this background explanation of what the Euler characteristic $\chi$ is, we now return to the subject of this 
appendix and of this chapter. Can we write $\chi$ as an integral? In other words, how do we calculate $\chi$, originally 
defined to be $V - E + F$, in the continuum limit where $V$, $E$, and $F$ are not defined? 

We are given a closed surface, that is, a 2-manifold $M$ without boundary (in other words, we are setting $h = 0$ 
for simplicity). What is a 2-form that we could integrate over $M$? The curvature 2-form $R^{\alpha\beta}$ comes to mind; so 
let’s try $\int_M \epsilon_{\alpha\beta} R^{\alpha\beta}$. (It is also instructive to try the other possibility: $e^\alpha e^\beta$. It turns out that $\int_M \epsilon_{\alpha\beta} e^\alpha e^\beta$ is the area 
of the surface, as you might have guessed from the fact that it does not contain any derivative.) We first work out 
the integrand in terms of a more elementary notation: 


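$$\epsilon_{\alpha\beta} R^{\alpha\beta} = \epsilon_{\alpha\beta} R^{\alpha\beta}{}_{\mu\nu}\, dx^\mu dx^\nu = \epsilon_{\alpha\beta}\, e^\alpha_\rho e^\beta_\sigma R^{\rho\sigma}{}_{\mu\nu}\, dx^\mu dx^\nu$$
$$= (\det e)\, \epsilon_{\rho\sigma} R^{\rho\sigma}{}_{\mu\nu}\, \epsilon^{\mu\nu} d^2x = 2\, d^2x \sqrt{g}\, \delta^\mu_\rho \delta^\nu_\sigma R^{\rho\sigma}{}_{\mu\nu} = 2\, d^2x \sqrt{g} R \tag{18}$$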


(In the next to last step, we used (6).) 

We have known, for quite a while now, that for a sphere of radius $a$, the scalar curvature $R = 2/a^2$. The area 
is $\int_M d^2x \sqrt{g} = 4\pi a^2$. Thus, the radius cancels out in the integral $\int_M \epsilon_{\alpha\beta} R^{\alpha\beta} = 2\int_M d^2x \sqrt{g} R$, and this integral 
is indeed, as we might suspect, topological in character, equal to some constant like $16\pi$. 

The scalar curvature of a torus was calculated back in exercise 1.6.2 and again in exercise 1.5.16; it assumes 
both positive and negative values. I will let you check that the integral gives 0 for a torus. We claim that, up to 
some overall factor, this integral gives the Euler characteristic $\chi$. 
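
For readers who like to see numbers, here is a rough numerical sketch of the two integrals. It assumes the standard round sphere ($\sqrt{g} = a^2 \sin\theta$, $R = 2/a^2$) and the standard embedded torus with tube radius $a$ and distance $b > a$ from the axis to the center of the tube, for which $\sqrt{g} = a(b + a\cos\theta)$ and $R = 2\cos\theta / [a(b + a\cos\theta)]$. The midpoint-rule integrator, the function names, and the particular radii are my own illustrative choices.

```python
import math

def integrate_2d(f, u_range, v_range, n=200):
    """Midpoint-rule estimate of the double integral of f(u, v)."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            total += f(u, v)
    return total * du * dv

def sphere_integrand(a):
    """sqrt(g) * R for a sphere of radius a."""
    return lambda theta, phi: (a**2 * math.sin(theta)) * (2.0 / a**2)

def torus_integrand(a, b):
    """sqrt(g) * R for the assumed torus embedding (tube radius a, b > a)."""
    def f(theta, phi):
        sqrt_g = a * (b + a * math.cos(theta))
        R = 2.0 * math.cos(theta) / sqrt_g
        return sqrt_g * R
    return f

two_pi = 2.0 * math.pi
# 2 * integral of sqrt(g) R, the right-hand side of (18) integrated over M
sphere = 2.0 * integrate_2d(sphere_integrand(1.7), (0.0, math.pi), (0.0, two_pi))
torus = 2.0 * integrate_2d(torus_integrand(1.0, 3.0), (0.0, two_pi), (0.0, two_pi))
```

Changing the radii should leave `sphere` close to $16\pi$ and `torus` close to 0, which is the numerical face of the statement that the integral is topological.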

In fact, Descartes’ theorem is just the statement, up to some overall constant, that the Euler characteristic $\chi$ 
equals 2 for a surface with the topology of the sphere. Go back to the spherical-looking blob. Under the microscope, 
we see that the surface is formed out of zillions of triangles. Inside any triangle, we have a flat surface; the surface 
curvature is concentrated on the edges and at the vertices. Indeed, the angular deficit measures our intuitive 
understanding of curvature: that which we cannot iron flat is curvature. The smaller the angular deficit is at a 
vertex, the less that vertex sticks out. When the angular deficit vanishes, the surface around that vertex is flat. 
The angular deficit “measures” the curvature. 

The generalization to a 4-manifold $M$ almost immediately suggests itself. Let’s parallel the discussion embodied 
in (18) and integrate the 4-form $\epsilon_{\alpha\beta\gamma\delta} R^{\alpha\beta} R^{\gamma\delta}$ over $M$. As in the discussion above, the characteristic length $a$ 
of $M$ will cancel out in the integral $\int_M \epsilon_{\alpha\beta\gamma\delta} R^{\alpha\beta} R^{\gamma\delta}$. I will let you have the fun of working this out in an exercise. 
That this integral is a topological invariant⁸ is known as the Gauss-Bonnet theorem in the physics literature. 

In the interest of keeping this appendix to a manageable size, I have not proven why the 2 integrals mentioned 
here are topological invariants,⁹ but instead, have “merely” shown you why they must be so (in the spirit of what 
the American Mathematical Society said about my field theory textbook, as quoted in a footnote in chapter V.6). 


Exercises 


1 Check Descartes’ theorem for the icosahedron, constructed out of 20 equilateral triangular faces. Hint: Use 
Euler’s theorem. 
2 Show that the integral $\int_M d^2x \sqrt{g} R$ vanishes for a torus. 

3 Just as in (18), evaluate the 4-form $\epsilon_{\alpha\beta\gamma\delta} R^{\alpha\beta} R^{\gamma\delta}$ in a more elementary notation. 


Notes 


1. The first appearance of this term in the physics literature, as far as I know, is in S. Deser, R. Jackiw, and 
S. Templeton, Phys. Rev. Lett. 48 (1982), pp. 975-978. 

2. It is implicitly assumed here that such a description is in fact appropriate. 

3. Here we implicitly assume that the microscopic physics produces only one length scale $l$. 

4. For a glimpse of this fascinating subject, the interested reader is referred to QFT Nut, chapter VI.2. He or 
she could then move on to more specialized monographs. 

5. See, for example, A. Zee, “Quantum Hall Fluids,” in Field Theory, Topology and Condensed Matter Physics, 
ed. H. B. Geyer, Springer, 1994, with references to the original literature. 

6. For further discussion of topological effects in quantum Hall fluids, see the reference in endnote 5. In 
particular, a topological quantity called the shift was introduced in X. G. Wen and A. Zee, Phys. Rev. Lett. 69 
(1992), p. 953, 3600(E). 

7. It turns out that this theorem about punctured surfaces is useful in studying RNA folding. See M. Bon, 
G. Vernizzi, H. Orland, and A. Zee, J. Mol. Biol. 379 (2008), p. 900. 

8. Thus, this integral can be added to the Hilbert-Einstein action and the resulting action studied. See, for 
example, I. Low and A. Zee, Nucl. Phys. B585 (2000), p. 395; P. Binétruy, C. Charmousis, S. Davis, J. F. 
Dufaux, Phys. Lett. B 544 (2002), p. 185. 


9. For the reader eager for a proof, I give a few hints on how to produce a proof, sketched in the briefest possible 
way. The essential physics (and mathematics) goes back to Faraday’s entirely intuitive picture of magnetic 
flux lines and their conservation. Consider the magnetic flux (or electric flux for that matter) going through a 
surface $A$ with boundary $C$, namely $\int_A d\mathbf{a} \cdot \mathbf{B}$, with $d\mathbf{a}$ an infinitesimal area element. Now distort the surface 
$A$ to a surface $A'$ with the same boundary $C$. Then Faraday tells us that $\int_A d\mathbf{a} \cdot \mathbf{B} = \int_{A'} d\mathbf{a} \cdot \mathbf{B}$. Equivalently, 
write $0 = \int_A d\mathbf{a} \cdot \mathbf{B} - \int_{A'} d\mathbf{a} \cdot \mathbf{B} = \int_S d\mathbf{a} \cdot \mathbf{B}$, where in the last expression $S = A - A'$ denotes the closed surface 
formed by $A$ and $-A'$. For example, $S$ could be the 2-sphere $S^2$, with $A$ and $-A'$ its northern and southern 
hemispheres, respectively, and $C$ the equator. What we just said is simply the elementary fact that the magnetic 
flux enclosed by $S^2$ vanishes. Now imagine a magnetic monopole sitting inside $S^2$. Then the magnetic flux 
enclosed by $S^2$ would be equal to 1 in some suitable units. But the conservation of magnetic flux lines now 
tells us that we could distort $S^2$ to any surface $S$ with the topology of the sphere, and as long as $S$ encloses 
the magnetic monopole, the total flux $\int_S d\mathbf{a} \cdot \mathbf{B}$ will continue to be equal to 1 regardless of the shape of $S$. 
In a sense, this is the first hint that topology is relevant to physics. We can deform the surface $S$, up to some 
limit, without changing the total flux $S$ encloses, be it equal to 0, 1, or some other value. 

In the language of forms, the preceding discussion is intimately related to what we touched upon in 
appendix 2 to chapter IX.7. The electromagnetic 2-form $F$ is closed, that is, $dF = 0$ (corresponding to 
magnetic flux conservation), but $F$ is only locally, not globally, exact, that is, $F = dA$ only locally, 
not globally. Otherwise, according to (IX.7.24), $\int F$ would vanish for any surface without a boundary, such 
as the sphere $S^2$. The integral $\int_S F$ can be nonzero precisely because, under some circumstances, we cannot 
define an electromagnetic 1-form $A$ over the entire $S$, but have to divide $S$ into overlapping “patches.” These 
considerations led Dirac to conclude that $\int_S F$ must be quantized to take on only integer values, just like the 
Euler characteristic. For details of this argument, which I do not have room to go into here, see, for example, 
QFT Nut, p. 248. To show that the 2 integrals discussed in this appendix are also quantized to take on integer 
values, we follow essentially the same steps (with $d$ replaced by the covariant $D$). 


X.6 A Brief Introduction to Twistors 


Twistors 


Here I introduce you to twistors. After lying dormant for decades, twistors have recently 
returned to fundamental physics amid tremendous excitement.¹ Introductory texts on 
Einstein gravity do not normally cover twistors, but I cannot resist giving readers who 
have gotten this far at least a flavor of what the recent excitement is about. Actually, you 
are well equipped, as you will see, by the discussion in, for example, chapters III.3 and 
VII.2, to embark on a journey into twistor space. We won’t get very far, but my hope is that 
this brief introduction will inspire you to venture deeper into this beautiful subject. 

In the following, we will need a few concepts that some readers may be unfamiliar with. 
For the benefit of these readers, I will collect these topics in appendix 1, which you might 
want to read first before going on. And of course, if you find these concepts too alien, you 
could simply skip this chapter. 

Also, while I find the mathematical foundation of twistors fascinating and beautiful, 
here I adopt a down-to-earth and pedestrian approach, dealing with twistors entirely at the 
“arithmetical level” that most theoretical physicists favor, without inessential mathematical 
embellishments. I provide the motivation for studying twistors as we move along. (I prefer 
to avoid the common practice of some writers telling the reader what something is good 
for before the reader has any idea what that something is.) 

The discussion here is restricted to flat spacetime. 


Covering the Lorentz group 


Given four real numbers $p^\mu = (p^0, p^1, p^2, p^3)$ (which we can regard as the momentum 
of a particle), consider the matrix 


$$p_{\alpha\dot\alpha} = \begin{pmatrix} p^0 - p^3 & -p^1 + i p^2 \\ -p^1 - i p^2 & p^0 + p^3 \end{pmatrix} \tag{1}$$


Here the two indices $\alpha$ and $\dot\alpha$ run over 1, 2. Notice that the matrix $p$ is hermitean and 
thus can be written as a linear combination of the matrices $\sigma^\mu$: $p_{\alpha\dot\alpha} = -p_\mu(\sigma^\mu)_{\alpha\dot\alpha} = 
(p^0 I - p^i\sigma^i)_{\alpha\dot\alpha}$ (see appendix 1). Clearly, there is an unavoidable notational overload: the 
single letter $p$ denotes both the vector and the matrix, but you should be able to tell from 
the context which object is being referred to. 


By inspection, we see that the determinant of (1), $\det p = (p^0)^2 - (p^1)^2 - (p^2)^2 - 
(p^3)^2 = -\eta_{\mu\nu} p^\mu p^\nu = -p \cdot p$, is just the Minkowskian square of the 4-momentum $p$. 
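
If you would rather trust a computer than inspection, the following minimal sketch builds the matrix (1) from a sample 4-vector and compares its determinant with $-p \cdot p$. The helper names and the sample momentum are hypothetical; the signature is taken to be $(-,+,+,+)$, as in the text.

```python
def momentum_matrix(p):
    """2-by-2 matrix of equation (1) for p = (p0, p1, p2, p3)."""
    p0, p1, p2, p3 = p
    return [[p0 - p3, -p1 + 1j * p2],
            [-p1 - 1j * p2, p0 + p3]]

def det2(m):
    """Determinant of a 2-by-2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minkowski_square(p):
    """p.p with signature (-,+,+,+)."""
    p0, p1, p2, p3 = p
    return -p0**2 + p1**2 + p2**2 + p3**2

def is_hermitean(m, tol=1e-12):
    """True if m equals its own conjugate transpose."""
    return all(abs(m[i][j] - complex(m[j][i]).conjugate()) < tol
               for i in range(2) for j in range(2))

p = (5.0, 1.0, -2.0, 3.0)   # an arbitrary sample 4-vector
P = momentum_matrix(p)
```

For this sample, `det2(P)` and `-minkowski_square(p)` both equal 11.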

Let us now indulge in a few steps of elementary linear algebra. Let $L$ denote an arbitrary 
2-by-2 complex matrix with determinant equal to 1. Assuming that $L_1$ and $L_2$ are 
two such matrices, the product $L_1 L_2$ is also a 2-by-2 complex matrix with $\det(L_1 L_2) = 
(\det L_1)(\det L_2) = 1$. Thus, the set of all such matrices forms a group known² as SL(2, C) 
to the cognoscenti, with the letters indicating that this is the special linear group of 2-by-2 
matrices over the complex numbers. 


Given the matrix $p$, let us consider 

$$p' = L^\dagger p L \tag{2}$$


for some element $L$ of SL(2, C). Manifestly, $p'$ is also hermitean, since $(p')^\dagger = (L^\dagger p L)^\dagger = 
L^\dagger p^\dagger L = L^\dagger p L = p'$, and thus can be written as $p' = (p'^0 I - p'^i \sigma^i)$. In this way, an 
element $L$ defines a transformation on 4-vectors, taking $p$ into $p'$. Now observe that 
$\det p' = (\det L^\dagger)(\det p)(\det L) = \det p$, or in other words, the transformation preserves 
the Minkowskian square of the 4-momentum: $p'^2 = p^2$. As you might have expected, it is 
a Lorentz transformation. 

This shows that an element $L$ of SL(2, C) corresponds to an element $\Lambda(L)$ of the 
Lorentz group SO(3, 1). However, it is not a 1-to-1 map, since $L$ and $-L$ give the same 
transformation $p \to p'$. Mathematicians say that SL(2, C) double covers SO(3, 1). Furthermore, 
if $L$ is also unitary, that is, $L^\dagger = L^{-1}$, then from (2) we have $p'^0 = p^0$, and the 
transformation is a rotation. In other words, the SU(2) subgroup of SL(2, C) double covers 
the rotation subgroup SO(3) of the Lorentz group SO(3, 1). (Readers familiar with 
quantum mechanics will recognize that here we are extending and generalizing the standard 
discussion of how spin $\frac{1}{2}$ particles transform under rotation.) If $L$ is not unitary, we 
have $p'^0 \neq p^0$, and the transformation involves a Lorentz boost. (For more details, see 
appendix 1.) 
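
The statements in this section can be tried out numerically. The sketch below implements the transformation (2) with plain nested lists; all helper names, the sample momentum, and the sample matrices are hypothetical choices made only for illustration.

```python
import cmath

def matmul(a, b):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(m):
    """Hermitean conjugate of a 2-by-2 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def momentum_matrix(p):
    """Matrix of equation (1) for p = (p0, p1, p2, p3)."""
    p0, p1, p2, p3 = p
    return [[p0 - p3, -p1 + 1j * p2],
            [-p1 - 1j * p2, p0 + p3]]

def components(m):
    """Read (p0, p1, p2, p3) back off a hermitean 2-by-2 matrix."""
    p0 = (m[0][0] + m[1][1]).real / 2
    p3 = (m[1][1] - m[0][0]).real / 2
    return (p0, -m[0][1].real, m[0][1].imag, p3)

def transform(L, p):
    """Equation (2): p' = L^dagger p L, returned as a 4-vector."""
    return components(matmul(matmul(dagger(L), momentum_matrix(p)), L))

def minkowski_square(p):
    """p.p with signature (-,+,+,+)."""
    p0, p1, p2, p3 = p
    return -p0**2 + p1**2 + p2**2 + p3**2

def sl2c(a, b, c):
    """[[a, b], [c, d]] with d fixed by det = 1 (requires a != 0)."""
    return [[a, b], [c, (1 + b * c) / a]]

p = (5.0, 1.0, -2.0, 3.0)
L_generic = sl2c(1 + 2j, 0.5, -1j)                  # not unitary
L_minus = [[-x for x in row] for row in L_generic]  # the matrix -L
phi = 0.7
L_rotation = [[cmath.exp(0.5j * phi), 0], [0, cmath.exp(-0.5j * phi)]]  # unitary

p_generic = transform(L_generic, p)
p_minus = transform(L_minus, p)
p_rotated = transform(L_rotation, p)
```

In this example `p_generic` and `p_minus` coincide (the double cover), all three transformed vectors have the same Minkowskian square as `p`, and only the unitary matrix leaves the time component untouched.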


Penrose and the twistor 


Penrose, in inventing twistors, was motivated by the thought that, in a general spacetime, 
lightlike or null lines traced by light might be more fundamental than points. Given your 
familiarity with Penrose diagrams by now, you will not be surprised that this is one and the 
same Roger Penrose, who incidentally has also authored a number of well-known popular 
books. After all, Penrose diagrams emphasize the causal web constructed out of null lines 
between various events in spacetime. In this chapter, we restrict ourselves to Minkowski 
spacetime (as I’ve already mentioned). 
So let’s consider a lightlike vector $p$. The preceding discussion simplifies enormously! 
We have $\det p = -p^2 = 0$, and hence the matrix $p$ generically has one zero eigenvalue. (In 
fancy talk, the matrix has rank 1 rather than 2.) From elementary linear algebra, we recall 
that a 2-by-2 matrix $m$ of rank 1 can always be written as $m_{ij} = v_i w_j$, with $v$ and $w$ two 
2-component vectors (since the vector orthogonal to $w$ provides the zero eigenvector). Thus, 
for a lightlike vector $p^\mu$, we can write 

$$p_{\alpha\dot\alpha} = \lambda_\alpha \tilde\lambda_{\dot\alpha} \tag{3}$$


The two 2-component objects $\lambda$ and $\tilde\lambda$ are sometimes called helicity spinors. (Our friend 
the Jargon Guy is beside himself with joy in this chapter.) 

Upon first exposure, the formalism appears quite opaque, but actually, like a lot of 
formalisms, it is fairly simple or perhaps even trivial. If you are confused at any point in 
the following exposition, just work things out explicitly. For example, consider a physical 
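momentum with $p^0 = E > 0$. With no loss of generality, call the direction of $\mathbf{p}$ the third axis, 
so that (with a trivial abuse of notation) $p = \begin{pmatrix} E - p & 0 \\ 0 & E + p \end{pmatrix}$, which for $p$ lightlike collapses 
to the rank 1 matrix $p = 2E \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = 2E \begin{pmatrix} 0 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix}$. Thus, in this case, $\lambda$ and $\tilde\lambda$ are both 
equal to $\sqrt{2E} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ numerically. (To make sure you get it, work this out for $\mathbf{p}$ pointing in 
some other direction.) 


You can think of the Pauli spinors $\lambda$ and $\tilde\lambda$ as the* “square root” of the Lorentz vector $p^\mu$. 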
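
Here is one way to take the “square root” explicitly for a momentum pointing in any direction, offered as a sketch rather than as the construction used in the literature. It assumes a real lightlike momentum with $p^0 > 0$ and $p^0 + p^3 \neq 0$, takes $\tilde\lambda = \lambda^*$, and fixes the arbitrary overall phase so that the lower component of $\lambda$ is real. The function names are hypothetical.

```python
import math

def helicity_spinor(p):
    """A spinor lam with lam_a * conj(lam_b) equal to the matrix (1).

    Assumes p = (p0, p1, p2, p3) is real and lightlike with p0 + p3 > 0.
    The overall phase of lam is an arbitrary choice.
    """
    p0, p1, p2, p3 = p
    norm = math.sqrt(p0 + p3)
    return ((-p1 + 1j * p2) / norm, complex(p0 + p3) / norm)

def outer(lam, lam_tilde):
    """The rank 1 matrix with entries lam_a * lam_tilde_b."""
    return [[lam[a] * lam_tilde[b] for b in range(2)] for a in range(2)]

def momentum_matrix(p):
    """Matrix of equation (1) for p = (p0, p1, p2, p3)."""
    p0, p1, p2, p3 = p
    return [[p0 - p3, -p1 + 1j * p2],
            [-p1 - 1j * p2, p0 + p3]]

p = (3.0, 1.0, 2.0, 2.0)                     # lightlike: 9 = 1 + 4 + 4
lam = helicity_spinor(p)
lam_tilde = tuple(x.conjugate() for x in lam)
rebuilt = outer(lam, lam_tilde)              # should reproduce momentum_matrix(p)

E = 2.0
lam_axis = helicity_spinor((E, 0.0, 0.0, E))  # momentum along the third axis
```

For the momentum along the third axis, `lam_axis` reduces to $\sqrt{2E}$ times the column (0, 1), in agreement with the explicit example in the text.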

Another motivation for studying twistors (note that I haven't told you what they are 
yet!) comes from particle physics, specifically, quantum field theory. Fear not, the only 
knowledge of quantum field theory I ask of you is minimal. First, just as in quantum 
mechanics, one of the tasks of quantum field theory is to calculate the scattering amplitude 
$M(p_1, p_2, p_3, \cdots, p_n)$ involving particles with momentum $p_a$. (Here we have written the 
amplitude for a process of the form $p_1 + p_2 \to p_3 + \cdots + p_n$.) The second thing I need you 
to know is that when we quantize the electromagnetic field, we obtain photons,† and when 
we quantize the gravitational field, we obtain gravitons, something I already mentioned 
back in chapter IX.4. 

In our titanic struggle to tame quantum gravity (see chapter X.8), one (fairly down-to- 
earth) approach is to study the scattering of gravitons off each other and see what happens. 
Gravitons are of course, just like photons, massless, and they carry null momentum, so that 
in the scattering amplitude $M(p_1, p_2, p_3, \cdots, p_n)$, the momenta $p_a$, for $a = 1, \cdots, n$, are 
all null. Henceforth, we simply write the amplitude as $M(p_a)$, and in fact, even as $M(p)$. 
This standard abuse of notation is not as distasteful as you might think, as we will be 
focusing on one specific momentum at a time. The preceding discussion indicates that 
we can also write the amplitude as $M(\lambda_a, \tilde\lambda_a)$, or more compactly, $M(\lambda, \tilde\lambda)$. 


* Some sophisticated readers might realize that this rather nontrivial possibility of taking a square root of a 
vector is foreordained by the structure of the Lorentz group. In a sense, this represents Dirac’s great discovery. 
See QFT Nut, chapter II.3, for example. 

† Einstein’s Nobel Prize for the photoelectric effect! 
Complexification, or two times 


Since momentum is characterized by 4 real numbers $p^\mu$, the matrix $p_{\alpha\dot\alpha} = -p_\mu(\sigma^\mu)_{\alpha\dot\alpha}$ 
is hermitean (indeed, that was how we started this chapter), which implies that $\tilde\lambda = \lambda^*$ is 
the complex conjugate of $\lambda$. The spinor $\tilde\lambda$ is not independent of $\lambda$. As students of physics, 
we know that constraints are, generically, bad news. Theorists (and mathematicians) need 
more freedom! We prefer to keep the variables $\lambda$ and $\tilde\lambda$ in $M(\lambda_a, \tilde\lambda_a)$ independent of each 
other. 

Our esteemed experimentalist friends insist that the momentum components $p^\mu$ must 
be real, and they are absolutely right, of course. Theorists, on the other hand, are free* 
to analytically continue the variables in the scattering amplitude $M(p_1, p_2, p_3, \cdots, p_n)$ to 
complex values. As you learned in a course on complex analysis, to evaluate an integral 
over a real variable, it is often useful to analytically continue the integrand into the complex 
plane and use Cauchy’s theorem. We are proceeding in the same spirit here. It is known in 
quantum field theory (and in quantum mechanics) that scattering amplitudes are analytic 
functions of their kinematic variables.³ Thus, quantum field theorists often continue 
analytically without a moment’s thought. Theoretical physicists love that guy Cauchy! Of 
course, all momenta in the scattering amplitude are to be set back to reality at the end of 
the calculation. 

I invite you to verify that the discussion in this chapter thus far goes through even if $p^\mu$ 
are complex, so that $\lambda$ and $\tilde\lambda$ are no longer yoked to each other. 

An alternative approach is to change the signature of spacetime, from $(-+++)$ to 
$(--++)$, so that, instead of the Lorentz group SO(3, 1), we consider the group SO(2, 2). 
(I mentioned the groups SO(m, n) as far back as chapter III.3.) As Minkowski already 
noted in his famous paper referred to in chapter III.3, it is a simple matter of removing 
(or adding) an $i$ here and there. Let us strip the Pauli matrix $\sigma^2$ (kind of a troublemaker or 
at least an odd man out) of its $i$ and define (for our purposes here) $\sigma^2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. Any real 
2-by-2 matrix $p$ (this matrix is of course to be distinguished from the matrix $p$ in (1)) can 
be decomposed as 

$$p = p^4 I + p^1\sigma^1 + p^2\sigma^2 + p^3\sigma^3$$

with $(p^1, p^2, p^3, p^4)$ four real numbers. Now we have $\det p = (p^4)^2 + (p^2)^2 - (p^3)^2 - 
(p^1)^2$. Instead of SL(2, C), consider SL(2, R), consisting of all 2-by-2 real matrices with 
unit determinant. For any two elements $L_1$ and $L_2$ of this group, transform $p \to p' = 
L_1 p (L_2)^T$. Evidently, $\det p' = (\det L_1)(\det p)(\det L_2) = \det p$. Thus, the transformation 
preserves the quadratic invariant $(p^4)^2 + (p^2)^2 - (p^3)^2 - (p^1)^2$. This shows explicitly that 
the group SO(2, 2) is locally isomorphic to SL(2, R) ⊗ SL(2, R), where the two factors of 
SL(2, R) reflect the fact that $L_1$ and $L_2$ can be chosen independently of each other. 


* What we are doing in this chapter is following the three ways of the warrior theorist; see QFT Nut, p. 522. 

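For a null SO(2, 2) vector $p$, that is, a real 4-vector such that $(p^4)^2 + (p^2)^2 - (p^3)^2 - 
(p^1)^2 = 0$, we can write $p_{\alpha\dot\alpha} = \lambda_\alpha \tilde\lambda_{\dot\alpha}$, with $\lambda$ and $\tilde\lambda$ two independent real spinors. Indeed, $\lambda$ 
and $\tilde\lambda$ transform independently, according to 


$$\lambda_\alpha \to (L_1)_\alpha{}^\beta \lambda_\beta \quad \text{and} \quad \tilde\lambda_{\dot\alpha} \to (L_2)_{\dot\alpha}{}^{\dot\beta} \tilde\lambda_{\dot\beta} \tag{4}$$
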

Incidentally, we are changing the signature of spacetime in the same spirit as complexify- 
ing a manifestly real variable. At the end of the calculation, the signature is to be switched 
back to the physical signature. Nobody is suggesting that we live in a spacetime with two 
time dimensions and two space dimensions. 

Both approaches, complexifying momentum and changing signature, are used in the 
literature. We will jump back and forth between the two approaches. 
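
The SO(2, 2) construction described in this section can also be checked numerically. The sketch below is only illustrative: it assumes one particular convention for the real decomposition, namely $p = p^4 I + p^1\sigma^1 + p^2 s + p^3\sigma^3$ with the real matrix $s$ having rows (0, -1) and (1, 0), and the sign choices here are mine. The quadratic invariant does not depend on those signs. The helper names and sample matrices are hypothetical.

```python
def real_matrix(p):
    """Real 2-by-2 matrix for p = (p1, p2, p3, p4) in the assumed convention."""
    p1, p2, p3, p4 = p
    return [[p4 + p3, p1 - p2],
            [p1 + p2, p4 - p3]]

def components(m):
    """Invert real_matrix: read (p1, p2, p3, p4) off a real 2-by-2 matrix."""
    p4 = (m[0][0] + m[1][1]) / 2
    p3 = (m[0][0] - m[1][1]) / 2
    p1 = (m[1][0] + m[0][1]) / 2
    p2 = (m[1][0] - m[0][1]) / 2
    return (p1, p2, p3, p4)

def matmul(a, b):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def invariant(p):
    """(p4)^2 + (p2)^2 - (p3)^2 - (p1)^2, the determinant of real_matrix(p)."""
    p1, p2, p3, p4 = p
    return p4**2 + p2**2 - p3**2 - p1**2

def transform(L1, L2, p):
    """p -> L1 p L2^T, returned as a 4-vector."""
    return components(matmul(matmul(L1, real_matrix(p)), transpose(L2)))

L1 = [[2.0, 3.0], [1.0, 2.0]]     # det = 4 - 3 = 1
L2 = [[1.0, -4.0], [0.5, -1.0]]   # det = -1 + 2 = 1
p = (1.0, 2.0, 0.5, 3.0)
p_new = transform(L1, L2, p)
```

Because `L1` and `L2` are chosen independently, this is a small illustration of the two SL(2, R) factors acting separately while `invariant(p_new)` stays equal to `invariant(p)`.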


Freedom to rescale 


You learned in school that the ordinary square root has a sign ambiguity. Analogously, in 
(3), $p$ does not determine $\lambda$ and $\tilde\lambda$ uniquely. We can always rescale 


$$\lambda \to t^{-1}\lambda \quad \text{and} \quad \tilde\lambda \to t\tilde\lambda \tag{5}$$


for any nonzero complex number $t$. (You might have wondered what fixed the overall constant in $\lambda$ 
and $\tilde\lambda$ in the simple explicit example above; I made an arbitrary choice.) This freedom to 
rescale will play an important role. 

By the way, for real momentum, $\tilde\lambda = \lambda^*$, and so the rescaling parameter $t$ is restricted to 
be a phase factor $e^{i\phi}$. In this case, the condition that $p$ has rank 1 allows for two solutions: 
$p_{\alpha\dot\alpha} = \pm\lambda_\alpha\tilde\lambda_{\dot\alpha}$, with the two possible signs corresponding to whether $p^0$ is positive or 
negative. With the SO(2, 2) signature, the rescaling parameter $t$ is restricted to be a real 
number. 

It is instructive to count the number of real degrees of freedom for these two different 
approaches. 

A complex lightlike momentum depends on $4 \times 2 - 2 = 6$ real numbers, since the 
condition $p^2 = 0$ now amounts to two real conditions, while $\lambda$ and $\tilde\lambda$ each contain 2 complex 
numbers. But with rescaling, we are left with $2 \times 2 - 1 = 3$ complex numbers, that is, 6 real 
numbers. 

A real lightlike momentum depends on $4 - 1 = 3$ real numbers, while $\lambda$ and $\tilde\lambda$ each 
contain 2 complex numbers. But now they are tied to each other, so altogether, they contain 
2 complex numbers, which get reduced to 3 real numbers after rescaling by a phase factor. 

On the other hand, for a (real) lightlike vector transforming under SO(2, 2), we have 2 
real spinors, which after rescaling contain $2 \times 2 - 1 = 3$ real numbers. So it all works out, 
of course. 
Lorentz invariance 


In terms of helicity spinors, Lorentz invariance takes on a particularly simple form. 
Under a Lorentz transformation, $\lambda \to L\lambda$, with $L$ an arbitrary 2-by-2 complex matrix 
with determinant equal to 1, that is, an element of SL(2, C). Back in chapter III.3, given 
2 vectors $p$ and $q$, we constructed the Lorentz invariant quantity $p \cdot q = \eta_{\mu\nu} p^\mu q^\nu$. Given 
2 helicity spinors $\lambda$ and $\mu$, what is the Lorentz invariant quantity we can construct out 
of them? 

Once we realize that the only property of L that we have to work with is its unit 
determinant, the answer becomes clear. Define 


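$$\langle\lambda, \mu\rangle = \epsilon^{\alpha\beta}\lambda_\alpha\mu_\beta = -\langle\mu, \lambda\rangle \tag{6}$$


with the antisymmetric symbol $\epsilon^{12} = -\epsilon^{21} = -1$, $\epsilon^{11} = \epsilon^{22} = 0$. Under a Lorentz transformation, 
we have $\epsilon^{\alpha\beta}\lambda_\alpha\mu_\beta \to \epsilon^{\alpha\beta} L_\alpha{}^\gamma L_\beta{}^\delta \lambda_\gamma\mu_\delta = (\det L)\,\epsilon^{\gamma\delta}\lambda_\gamma\mu_\delta = \epsilon^{\gamma\delta}\lambda_\gamma\mu_\delta$, where 
we have used the definition of the determinant (as mentioned in chapter X.5). The quantity 
$\langle\lambda, \mu\rangle$ is manifestly Lorentz invariant. 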
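
A quick numerical illustration of why only the unit determinant matters: under $\lambda \to L\lambda$, $\mu \to L\mu$, the bracket picks up exactly a factor of $\det L$. The sketch below uses the sign convention $\epsilon^{12} = -1$ quoted above; the function names and sample spinors are hypothetical.

```python
def angle(lam, mu):
    """<lam, mu> = eps^{ab} lam_a mu_b, with eps^{12} = -eps^{21} = -1."""
    return -(lam[0] * mu[1] - lam[1] * mu[0])

def act(L, lam):
    """Apply the 2-by-2 matrix L to the 2-component spinor lam."""
    return (L[0][0] * lam[0] + L[0][1] * lam[1],
            L[1][0] * lam[0] + L[1][1] * lam[1])

def det2(L):
    """Determinant of a 2-by-2 matrix given as nested lists."""
    return L[0][0] * L[1][1] - L[0][1] * L[1][0]

lam = (1 + 2j, -0.5j)
mu = (0.3 - 1j, 2.0)

L = [[1 + 2j, 0.5], [-1j, -0.5j]]   # an element of SL(2, C): det = 1
K = [[2.0, 0.0], [0.0, 1.0]]        # det = 2, so not in SL(2, C)

invariant_before = angle(lam, mu)
invariant_after = angle(act(L, lam), act(L, mu))
scaled = angle(act(K, lam), act(K, mu))
```

Here `invariant_after` equals `invariant_before`, while `scaled` is twice as large, reflecting $\det K = 2$.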

Similarly, given $\tilde\lambda$ and $\tilde\mu$, we have the Lorentz invariant 


$$[\lambda, \mu] = \epsilon^{\dot\alpha\dot\beta}\tilde\lambda_{\dot\alpha}\tilde\mu_{\dot\beta} = -[\mu, \lambda] \tag{7}$$


(A trivial notational remark: You might be inclined to write $[\tilde\lambda, \tilde\mu]$, but then the twiddles 
are redundant. The square bracket is defined only for twiddled spinors.) 

In contracting helicity spinors, the antisymmetric symbol plays the same role as the metric 
$\eta_{\mu\nu}$ in contracting vectors and tensors. In parallel with the discussion in chapter III.3, we 
are clearly invited to define helicity spinors with an upper index according to $\mu^\alpha = \epsilon^{\alpha\beta}\mu_\beta$. 
Then the invariant $\langle\lambda, \mu\rangle$ can be written as $\lambda_\alpha\mu^\alpha$. 


Polarization and helicity 


You know that an electromagnetic wave has two polarizations. After quantization, the 
resulting photon has two helicity states labeled by +1 or —1: it can spin either clockwise or 
counterclockwise around the direction of its 3-momentum. As discussed in chapter IX.4, 
the situation in gravity is entirely analogous. A gravitational wave has two polarizations, 
and the graviton has two helicity states labeled by +2 or —2. (The photon has spin 1, while 
the graviton has spin 2, a fact that can be traced back to $A_\mu$ and $g_{\mu\nu}$ carrying one and two 
indices, respectively.) Thus, in talking about the graviton scattering amplitude, I have to 
specify the helicity of each graviton and write $M(p_1, h_1, p_2, h_2, p_3, h_3, \cdots, p_n, h_n)$ with 
$h_a = \pm 2$ for $a = 1, \cdots, n$. 

Now I have to tell you something about the scattering amplitude $M(\lambda_a, \tilde\lambda_a, h_a)$ when 
expressed in terms of helicity spinors. At this point, I appeal to your knowledge of quantum 
mechanics. Take a quantum state with angular momentum h around some axis. Rotate it 
around that axis through an angle $\xi$. Then the quantum state acquires a phase $e^{ih\xi} = t^{2h}$, 
where $t = e^{i\xi/2}$. 

Let’s focus on a specific particle and omit the subscript $a$, writing simply $M(\lambda, \tilde\lambda, h)$. 
Imagine rotating the quantum state of that specific particle around the direction of its 
momentum $\mathbf{p}$ through an angle $\xi$. Referring to appendix 1, we see that under this rotation, 
$\lambda \to t^{-1}\lambda$ and $\tilde\lambda \to t\tilde\lambda$. (This is a particular case of the rescaling expressed in (5).) Note that 
the momentum $p = \lambda\tilde\lambda$ is left unchanged, as it better be, since we are rotating using $\mathbf{p}$ as 
the rotation axis. What I just told you about quantum mechanics states that the scattering 
amplitude must satisfy 


$$M(t^{-1}\lambda, t\tilde\lambda, h) = t^{2h} M(\lambda, \tilde\lambda, h) \tag{8}$$


Keep in mind the suppressed subscript $a$. By analytic continuation, we argue that this 
scaling property should hold for arbitrary $t$ and will serve to severely restrict the scattering 
amplitude. Under rotation, $\lambda$ and $\tilde\lambda$ transform oppositely, and so what we are doing is 
simply counting the powers of $\tilde\lambda$ minus the powers of $\lambda$ in the scattering amplitude.* 


Power of helicity spinors 


After talking about scattering amplitudes for all this time, I regret to inform you that 
we can’t actually calculate one. That’s kind of a bummer, but to calculate a scattering 
amplitude, you would have to learn field theoretic methods, such as Feynman diagrams 
(just as, in nonrelativistic quantum mechanics, to calculate a scattering amplitude you 
would have to master stuff like perturbation theory), which are way beyond the scope of 
this book. However, I can impress upon you the power of the helicity spinor formalism. 

Instead of graviton scattering, let’s talk about the far simpler case of gluon scattering. 
I already mentioned, in chapter X.1 for example, that the strong interaction is described 
by a Yang-Mills theory, with the gluon playing the role of the photon. Consider two gluons 
scattering, ending up with 3 gluons (in the notation used earlier, this is described by 
$p_1 + p_2 \to p_3 + p_4 + p_5$). We refer to this as 5-gluon scattering. Suppose you want to 
calculate this to lowest order in perturbation theory, in the simplest possible case (for example, 
without any quarks around). If you used the traditional Feynman diagram method, the 
result contains something on the order of 7,000 terms.† When the result is expressed in 
terms of helicity spinors, the scattering amplitude, for a particular choice of helicities, 
simplifies dramatically to 


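$$M(1^-, 2^-, 3^+, 4^+, 5^+) = \frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 45\rangle\langle 51\rangle}\, \delta^{(4)}\Big(\sum_{a=1}^{5} p_a\Big) \tag{9}$$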


* This sounds more mysterious than it actually is. The reason for that is because, for fear of confusing some 
readers, I have not digressed into a discussion of how to write polarization vectors in terms of helicity spinors. 
See QFT Nut, pp. 489 and 493. 

† Part of the result is shown as a black smudge on p. 484 of QFT Nut. 
Here I use the compact notation favored in the research literature of writing $M(\cdots, 
p_a, h_a, \cdots)$ as $M(\cdots, a^{h_a}, \cdots)$ and $\langle\lambda_a, \lambda_b\rangle$ as $\langle ab\rangle$. The delta function specifies that 
$\sum_{a=1}^{5} p_a = 0$ and hence momentum conservation.* 

Behold, several thousand terms have collapsed into one single term! I trust that you 
are impressed. The point here is not how this amplitude is derived, but how it simplifies 
drastically when expressed in terms of “correct” variables. 

While I cannot derive* (9) for you, I can point out that this remarkable expression 
satisfies all our invariance requirements. Lorentz invariance is satisfied, since $\langle ab\rangle$, as defined 
in (6), is a Lorentz scalar. Let’s check the scaling requirement (8). Letting $\lambda_3 \to 
t^{-1}\lambda_3$, we have $M(1^-, 2^-, 3^+, 4^+, 5^+) \to t^2 M(1^-, 2^-, 3^+, 4^+, 5^+)$. In contrast, letting 
$\lambda_1 \to t^{-1}\lambda_1$, we have $M \to t^{-4} t^2 M = t^{-2} M$. You can see that scaling severely restricts 
$M$. Scaling and Lorentz invariance almost fix $M$ uniquely. 
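
The scaling check just described can be repeated numerically. The sketch below reads the spinor-dependent part of (9) as $\langle 12\rangle^4 / (\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 45\rangle\langle 51\rangle)$, with the momentum-conserving delta function stripped off; that reading, the function names, and the sample spinors (which are arbitrary and do not satisfy momentum conservation) are assumptions of mine.

```python
def angle(lam, mu):
    """<lam, mu> = eps^{ab} lam_a mu_b, with eps^{12} = -eps^{21} = -1."""
    return -(lam[0] * mu[1] - lam[1] * mu[0])

def amplitude(spinors):
    """<12>^4 / (<12><23><34><45><51>) for a list of five spinors."""
    l1, l2, l3, l4, l5 = spinors
    numerator = angle(l1, l2) ** 4
    denominator = (angle(l1, l2) * angle(l2, l3) * angle(l3, l4)
                   * angle(l4, l5) * angle(l5, l1))
    return numerator / denominator

def rescale(spinors, index, t):
    """Copy of spinors with spinor number `index` (1-based) multiplied by 1/t."""
    out = list(spinors)
    lam = out[index - 1]
    out[index - 1] = (lam[0] / t, lam[1] / t)
    return out

spinors = [(1 + 1j, 2), (0.5, -1j), (2 - 1j, 1), (-1, 3 + 0.5j), (0.3j, 1 - 2j)]
t = 1.3 - 0.4j

base = amplitude(spinors)
ratio_plus = amplitude(rescale(spinors, 3, t)) / base    # particle 3, helicity +1
ratio_minus = amplitude(rescale(spinors, 1, t)) / base   # particle 1, helicity -1
```

The two ratios come out as $t^2$ and $t^{-2}$, that is, $t^{2h}$ with $h = +1$ and $h = -1$, as required by (8).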


The ambitwistor representation 


After these many pages, I still haven't told you what a twistor is. I needed to set up helicity 
spinors first. Finally, we are ready to build twistors out of helicity spinors. 

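Consider a scattering amplitude $M$ and again focus on the particle $a$. Write $M(\lambda_a, \tilde\lambda_a)$, 
suppressing the dependence on the other particles. Let us Fourier transform $M$ in two 
possible ways (and overuse the letter $M$ somewhat): 


$$M(W_a) = \int d^2\lambda_a \exp(i\tilde\mu_a^{\alpha}\lambda_{a\alpha})\, M(\lambda_a, \tilde\lambda_a) \tag{10}$$
and 
$$M(Z_a) = \int d^2\tilde\lambda_a \exp(i\mu_a^{\dot\alpha}\tilde\lambda_{a\dot\alpha})\, M(\lambda_a, \tilde\lambda_a) \tag{11}$$


We have defined two 4-component objects (suppressing the subscript $a$): 


$$W = \begin{pmatrix} \tilde\mu \\ \tilde\lambda \end{pmatrix} \quad \text{and} \quad Z = \begin{pmatrix} \lambda \\ \mu \end{pmatrix} \tag{12}$$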


The intent here is to transform $M$ sequentially for $a = 1, 2, \cdots, n$, using either (10) or
(11). Consider SO(2, 2) here instead of SO(3, 1), so that the spinors $\lambda$ and $\tilde\lambda$ are real, and
hence we can take $\mu$ and $\tilde\mu$ to be real as well. Thus, these integral transforms are no more
and no less than the Fourier transforms you have long been familiar with, and the variable
$\mu$ is conjugate to the variable $\tilde\lambda$ in the same sense that $p$ is conjugate to $q$ in quantum
mechanics. The objects $W$ and $Z$ are known as a dual twistor and a twistor, respectively.⁵

What is the point of Fourier transforming and packaging 2-component objects into
4-component objects? One advantage is that the scaling requirement (8) comes out nicer.
Instead of $\lambda$ and $\tilde\lambda$ scaling oppositely, we now have, thanks to Mr. Fourier, $\lambda$ and $\mu$ scaling
the same way. Similarly for the pair $(\tilde\mu, \tilde\lambda)$.


* To write momentum conservation in this form, I have reversed the signs of $p_1$ and $p_2$. I have also omitted
mentioning various quantum field theoretic notions and technicalities, such as crossing and color stripping.


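We find easily that

$$M(tZ, h) = \int d^2\tilde\lambda \exp(it\mu\tilde\lambda)\, M(t\lambda, \tilde\lambda, h) = t^{-2}\int d^2\tilde\lambda' \exp(i\mu\tilde\lambda')\, M(t\lambda, t^{-1}\tilde\lambda', h) = t^{-2(h+1)} M(Z, h) \qquad (13)$$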


(displaying the helicity h of particle i while suppressing the index a). We used elementary 
calculus in the second equality and (8) in the third equality. Similarly, 


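$$M(tW, h) = \int d^2\lambda \exp(it\tilde\mu\lambda)\, M(\lambda, t\tilde\lambda, h) = t^{-2}\int d^2\lambda' \exp(i\tilde\mu\lambda')\, M(t^{-1}\lambda', t\tilde\lambda, h) = t^{2(h-1)} M(W, h) \qquad (14)$$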


This scaling result indicates that we should favor a mixed or ambitwistor representation
for the scattering amplitude, using $W$ when the particle carries + helicity and $Z$ when the
particle carries − helicity. (In particular, for gluons, $h = \pm 1$, and so we have M(tW, +) =
M(W, +) and M(tZ, −) = M(Z, −). We return to this remarkable result in a minute.)


SL(4, R) suddenly appears 


Another advantage of the twistor formalism is that these objects (12) with 4 real
components (we are still sticking to SO(2, 2) for the moment) naturally invite us to consider
transformation under the special linear group SL(4, R) over real numbers. In other words,
transform $Z \to \mathcal{L}Z$ with $\mathcal{L}$ a 4-by-4 matrix with real entries and $\det\mathcal{L} = 1$.

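But wait! The physics we started out with is supposed to be invariant under SO(2, 2) =
SL(2, R) ⊗ SL(2, R), evidently a subgroup of SL(4, R). Indeed, from (4), we have

$$\begin{pmatrix} \lambda_\alpha \\ \mu^{\dot\alpha} \end{pmatrix} \to \begin{pmatrix} (L)_\alpha{}^\beta & 0 \\ 0 & (\tilde L_\epsilon)^{\dot\alpha}{}_{\dot\beta} \end{pmatrix} \begin{pmatrix} \lambda_\beta \\ \mu^{\dot\beta} \end{pmatrix} \qquad (15)$$

where $(\tilde L_\epsilon)^{\dot\alpha}{}_{\dot\beta} = ((\tilde L^T)^{-1})^{\dot\alpha}{}_{\dot\beta} = (\epsilon\tilde L\epsilon^{-1})^{\dot\alpha}{}_{\dot\beta}$ (see appendix 1). (The reader struggling with
this material should not be overly concerned with the $\epsilon$ and $\epsilon^{-1}$; we merely have to raise
and lower some spinor indices.) What is important here is that the 4-by-4 matrices in the
subgroup SO(2, 2) are constructed by placing 2-by-2 blocks along the diagonal.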

Similarly, $W$ transforms under SL(4, R). Indeed, as indicated by the indices matching
up, the product $W\cdot Z = \tilde\mu^\alpha\lambda_\alpha + \tilde\lambda_{\dot\alpha}\mu^{\dot\alpha}$ is invariant under SL(4, R).

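Given more than one $W$ and $Z$, we also have the Lorentz invariants $Z_1 I Z_2 = \langle\lambda_1, \lambda_2\rangle$
and $W_1 I W_2 = [\tilde\lambda_1, \tilde\lambda_2]$. (Here $I$, in a slightly abused notation used in the literature, evidently
denotes either the 4-by-4 matrix $\begin{pmatrix} 0 & 0 \\ 0 & \epsilon \end{pmatrix}$ or $\begin{pmatrix} \epsilon & 0 \\ 0 & 0 \end{pmatrix}$, depending on whether it acts on $W$ or
$Z$.) Note that these two quantities $Z_1 I Z_2$ and $W_1 I W_2$ are not SL(4, R) invariant.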

What is this mysterious group SL(4, R) that contains the “Lorentz” group SL(2, R) ⊗
SL(2, R)? I will let you figure it out. Here is a hint: Count the number of generators. The
group SL(4, R) has $4^2 - 1 = 15$ generators, while SL(2, R) ⊗ SL(2, R) has $2(2^2 - 1) = 6$
generators. What could the remaining 15 − 6 = 9 generators possibly be? Do think for a
while.
Power of the ambitwistor


I now show you the power of the ambitwistor. Consider the 4-gluon scattering amplitude
$M(1^+, 2^-, 3^+, 4^-)$ to lowest order, the calculation of which using the traditional Feynman
diagram method can be done by hand but still involves about 100 terms. As explained
above, we should use the variable $W$ for particles 1 and 3, and $Z$ for 2 and 4.

Now apply the remarkable result M(tW, +) = M(W, +) and M(tZ, −) = M(Z, −) we
derived following (13) and (14). We have

$$M(tW_1^+, Z_2^-, W_3^+, Z_4^-) = M(W_1^+, tZ_2^-, W_3^+, Z_4^-) = M(W_1^+, Z_2^-, tW_3^+, Z_4^-)$$
$$= M(W_1^+, Z_2^-, W_3^+, tZ_4^-) = M(W_1^+, Z_2^-, W_3^+, Z_4^-)$$

Naively, it would appear that $M(W_1^+, Z_2^-, W_3^+, Z_4^-)$ does not depend on $W_1$, $Z_2$, $W_3$,
and $Z_4$ at all. We are tempted to conclude that, in the ambitwistor representation, this
scattering amplitude is, up to an irrelevant overall constant, just 1! Not so fast, though. It
could also be −1. The sign depends on which kinematic regime we are in. More carefully,
we conclude that

$$M(W_1^+, Z_2^-, W_3^+, Z_4^-) = \mathrm{sign}(W_1\cdot Z_2)\,\mathrm{sign}(Z_2\cdot W_3)\,\mathrm{sign}(W_3\cdot Z_4)\,\mathrm{sign}(Z_4\cdot W_1) \qquad (16)$$


As an exercise, you can Fourier transform back to the $\lambda$ and $\tilde\lambda$ representation.

The result (16) is truly amazing: it tells us that the 4-gluon scattering amplitude, when
written in appropriate variables, is just equal to +1 or −1, depending on the kinematic
regime. The 100 or so terms in the Feynman approach, alluded to above, are struggling
to tell us that they will sum up to ±1 when we translate everything into the language of
twistors.
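To see concretely why a product of signs is the natural answer, here is a small sketch of (16), assuming real 4-component twistors and taking $W\cdot Z$ to be the plain sum of products of components. The numerical values are arbitrary illustrations, and the function names are hypothetical.

```python
# Sketch of the sign structure in (16) for real (SO(2,2)) twistors.
# ASSUMPTION: W.Z is the ordinary sum of products of the four components.

def dot(w, z):
    """The SL(4,R)-invariant pairing W.Z."""
    return sum(a * b for a, b in zip(w, z))

def sign(x):
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def amplitude(w1, z2, w3, z4):
    """Product of the four sign factors appearing in (16)."""
    return (sign(dot(w1, z2)) * sign(dot(z2, w3))
            * sign(dot(w3, z4)) * sign(dot(z4, w1)))

def scaled(v, t):
    """The twistor v with every component multiplied by t."""
    return tuple(t * c for c in v)

# Arbitrary illustrative twistors (hypothetical values).
w1 = (1.0, -2.0, 0.5, 3.0)
z2 = (0.3, 1.0, -1.0, 2.0)
w3 = (-1.0, 0.4, 2.0, 0.1)
z4 = (2.0, -0.5, 1.0, -1.0)

base = amplitude(w1, z2, w3, z4)
checks = []
for t in (0.1, 7.0, -2.0):
    checks.append(amplitude(scaled(w1, t), z2, w3, z4) == base)
    checks.append(amplitude(w1, scaled(z2, t), w3, z4) == base)
    checks.append(amplitude(w1, z2, scaled(w3, t), z4) == base)
    checks.append(amplitude(w1, z2, w3, scaled(z4, t)) == base)

print(base, all(checks))
```

Each twistor enters exactly two of the four sign factors, so in this sketch the product is unchanged even when the scale factor is negative.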


Interaction among gravitons 


I hope that by these examples, I have convinced you plenty that the traditional Feynman 
diagram approach is almost hopeless when it comes to gluon scattering. The situation with 
gravity is far worse. 

Back in chapter IX.5, I mentioned that if we plug $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ into the Einstein-Hilbert
action and expand to $O(h^3)$, we obtain cubic terms of the form $h\partial h\partial h$, with
indices suppressed. Since there are 8 indices contracted every which way, the schematic
form $h\partial h\partial h$ actually contains many terms. I also explained that the infinite number
of terms of the form $h\cdots h\partial h\partial h$ describe the complicated interaction of many gravitons
with one another. In the traditional Feynman approach in quantum field theory,
one Fourier transforms $h\partial h\partial h$ to momentum space to obtain the interaction amplitude
$M(p_1, h_1, p_2, h_2, p_3, h_3)$ with $(p_a, h_a)$ the momentum and helicity of the 3 interacting
gravitons. Take my word for it, the whole thing is a horrible mess.

What does the basic cubic vertex for gravity come out to be in the ambitwistor
representation? Well, the scaling relations (13) and (14) tell us that, for $h = \pm 2$, we have
$M(tW, ++) = t^2M(W, ++)$ and $M(tZ, --) = t^2M(Z, --)$. Thus $M(Z_1^{--}, Z_2^{--}, W_3^{++})$
must be quadratic in $Z_1$, in $Z_2$, and in $W_3$. The only possibility⁶ is

$$M(Z_1^{--}, Z_2^{--}, W_3^{++}) = \left|(Z_1\cdot W_3)(Z_2\cdot W_3)(Z_1 I Z_2)\right| \qquad (17)$$


This amplitude describes the basic interaction of gravitons with one another. (If you are 
not that impressed, it is because you have never dealt with the mess referred to in the 
preceding paragraph.) In other words, this cubic vertex, as expressed in the language of 
twistors in (17), embodies the Einstein-Hilbert action, and thus, in some sense, provides 
a compact summary of this entire book. 
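The homogeneity argument leading to the cubic vertex can be checked in a few lines. This is a sketch under stated assumptions: the vertex is read as $|(Z_1\cdot W_3)(Z_2\cdot W_3)(Z_1 I Z_2)|$, twistors are ordered as $Z = (\lambda, \mu)$ so that $Z_1 I Z_2$ is the antisymmetric product of the first two components, and the sample values are arbitrary. The function names are hypothetical.

```python
# Homogeneity check for the cubic graviton vertex read off from (17).
# ASSUMPTIONS: Z = (lambda_1, lambda_2, mu_1, mu_2); W.Z is the plain dot product;
# Z_a I Z_b is the antisymmetric product of the lambda components.

def dot(w, z):
    return sum(a * b for a, b in zip(w, z))

def bracket_I(za, zb):
    """Z_a I Z_b = <lambda_a, lambda_b>, built from the first two components."""
    return za[0] * zb[1] - za[1] * zb[0]

def vertex(z1, z2, w3):
    """|(Z1.W3)(Z2.W3)(Z1 I Z2)|"""
    return abs(dot(z1, w3) * dot(z2, w3) * bracket_I(z1, z2))

def scaled(v, t):
    return tuple(t * c for c in v)

# Arbitrary illustrative twistors (hypothetical values).
z1 = (1.0, 0.5, -0.3, 2.0)
z2 = (0.2, -1.0, 0.7, 0.4)
w3 = (0.6, 1.5, -2.0, 0.1)

t = -1.5
base = vertex(z1, z2, w3)
ratio_z1 = vertex(scaled(z1, t), z2, w3) / base
ratio_z2 = vertex(z1, scaled(z2, t), w3) / base
ratio_w3 = vertex(z1, z2, scaled(w3, t)) / base

print(ratio_z1, ratio_z2, ratio_w3, t ** 2)
```

All three ratios should come out equal to $t^2$, which is the quadratic scaling demanded by the helicity $\pm 2$ relations.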

Another extragalactic fable suggests itself. In some other civilization, after the discovery 
of special relativity, some mathematically inclined physicist could have written Lorentz 
vectors in terms of helicity spinors and then constructed twistors out of them. Another 
theorist showed that a massless spin 2 particle generates the inverse square law of gravity. 
The cubic vertex for 3 interacting gravitons (17) could then be written down, and then 
Fourier transformed back to an expression involving helicity spinors. Expressing this in 
terms of momentum and then Fourier transforming to spacetime, some bright young guy 
could have discovered Einstein gravity (and then Riemannian geometry while he or she 
was at it) via this route! 

By the way, did you figure out what the group SL(4, R) is? If you didn’t, you should have
remembered chapter IX.9. Its 9 extra generators not in the Lorentz algebra of SL(2, R) ⊗
SL(2, R) describe 4 translations, 4 conformal transformations, and 1 dilation, corresponding to

$$\begin{pmatrix} 0 & 0 \\ X & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & X \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}$$

respectively. (Here $X$ denotes the 4 linearly independent 2-by-2 matrices, including the identity.)


Where is spacetime? 


In our discussion, we approached twistors by a purely utilitarian approach. We express the
physics in terms of ever shinier and better variables, from $p^\mu$ to $p_{\alpha\dot\alpha}$, to $\lambda$ and $\tilde\lambda$, and then
to $W$ and $Z$.* In this pedestrian approach, the beautiful geometric essence† of twistors is
completely obscured.

We have been acting mostly like particle physicists, talking about scattering amplitudes 
and living happily in momentum space. But where is the spacetime we know and love 
hiding in these scattering amplitudes? 


* In the literature, people have gone one step further to supertwistors W and Z by adjoining Grassmannian 
variables. This subject naturally invites the inclusion of supersymmetry, upon which it becomes, perhaps not 
surprisingly, even more elegant and compact. 

† The geometric origin of twistors has been illuminated by R. Penrose, A. Hodges, and others.
Well, Emmy Noether gives us a hint. All scattering amplitudes contain the momentum
conservation delta function, but we learned way way back in chapter I.2 that momentum
conservation, according to Noether’s theorem, encodes the translation invariance of
spacetime. Hence, one starting point might be to plug the scattering amplitude (9) into the
Fourier transform (11) and watch what happens. Replace the 4-dimensional delta function
$\delta^4(p)$ in (9) by its integral representation⁷ $\int d^4X\, e^{-ip\cdot X} = (2\pi)^4\delta^4(p)$ to obtain

$$M(Z_a) = f(\lambda)\int \prod_a d^2\tilde\lambda_a\, e^{i\mu_a\tilde\lambda_a}\, \delta^4\Big(\sum_a \lambda_a\tilde\lambda_a\Big) \qquad (18)$$
$$= f(\lambda)\,(2\pi)^{-4}\int \prod_a d^2\tilde\lambda_a\, e^{i\mu_a\tilde\lambda_a} \int d^4X\, e^{-i\sum_a \lambda_a X\tilde\lambda_a}$$
$$\propto f(\lambda)\int d^4X \prod_a \delta^2(\mu_a - X\lambda_a) \qquad (19)$$


In this context, we could care less⁸ about the factor $f(\lambda) = \langle 12\rangle^4/(\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 45\rangle\langle 51\rangle)$
from (9), which describes gluon scattering in detail and which the ~7,000 terms in the
Feynman approach were desperately trying to sum up to. If we want to compare with
experimental data on gluon scattering, we need $f(\lambda)$, but that’s not what we want to do.
Instead, we want to find spacetime!

What we have learned from (19) is that the two spinors $\lambda_a$ and $\mu_a$, contained in each
of the twistors $Z_a = (\lambda_a, \mu_a)$, are constrained by the equality $\mu_a^{\dot\alpha} = X^{\dot\alpha\alpha}\lambda_{a\alpha}$. The variable
$X$ appears as the Fourier dual of the momentum $p$ and so quite plausibly should be
interpreted as a spacetime coordinate.

We have found our beloved spacetime: $X$ is the thing that connects $\lambda$ and $\mu$.

Let’s give a simpler example to bolster our case. Consider the wave equation $\partial^2\phi = 0$.
The solution is given by the integral representation $\phi(X) = \int d^4p\, \delta(p^2) f(p) e^{ip\cdot X}$, where
$f(p)$ is some smooth function we don’t particularly care about in this context. That
$\phi(X)$ satisfies the wave equation is because of the delta function $\delta(p^2)$: namely $\partial^2\phi(X) =
-\int d^4p\, p^2\delta(p^2) f(p) e^{ip\cdot X} = 0$.

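The presence of the delta function allows us to express $p$ in terms of $\lambda$ and $\tilde\lambda$, as in (3),
and to write

$$\phi(X) = \int_P d^2\lambda\, d^2\tilde\lambda\, f(\lambda, \tilde\lambda)\, e^{i\lambda X\tilde\lambda} = \int_P d^2\lambda\, d^2\tilde\lambda \int d^2\mu\, f(\lambda, \mu)\, e^{i(X\lambda - \mu)\tilde\lambda}$$
$$= \int_P d^2\lambda \int d^2\mu\, f(\lambda, \mu) \int d^2\tilde\lambda\, e^{i(X\lambda - \mu)\tilde\lambda} = (2\pi)^2 \int_P d^2\lambda \int d^2\mu\, f(\lambda, \mu)\, \delta^2(\mu - X\lambda)$$
$$= (2\pi)^2 \int_P d^2\lambda\, f(\lambda, X\lambda) \qquad (20)$$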


(We have to mention a technical detail that doesn’t much matter for the main point we are
trying to make: the subscript $P$ on the integral sign indicates that we are really integrating
over projective space due to the rescaling freedom in (5). Concretely, this simply means
that we can set one of the components in $\lambda$ to 1 by scaling. This makes sense, since the
integral over $p$ we started with was over 3 real variables due to $\delta(p^2)$.)
Figure 1 (a) Points in twistor space (TS) represent null
lines in spacetime (ST). (b) A line in twistor space
corresponds to a point in spacetime.


These two examples indicate that, given a twistor $Z = (\lambda, \mu)$, we can define a point or
an event in spacetime by

$$\mu^{\dot\alpha} = X^{\dot\alpha\alpha}\lambda_\alpha \qquad (21)$$


The geometry of twistor space 


Spacetime has appeared, but is the solution to (21) unique? Physicists often neglect to ask
such refined questions, but here it is crucial to bow to the mathematicians. Suppose that
there also exists a $Y$ satisfying $\mu^{\dot\alpha} = Y^{\dot\alpha\alpha}\lambda_\alpha$. Subtracting, we obtain $(X - Y)^{\dot\alpha\alpha}\lambda_\alpha = 0$, which
tells us that the 2-by-2 matrix $(X - Y)$ has a zero eigenvalue, and hence $\det(X - Y) = 0$.
This in turn tells us that the vector $(X^\mu - Y^\mu)$ is lightlike or null. In other words, given a
solution $X$, any point $Y$ in spacetime null separated from $X$ is also a solution. A point in
twistor space (TS in figure 1) corresponds to a null line in spacetime (ST in figure 1) going
through points $X$ defined by (21). See figure 1a.
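The non-uniqueness just described is easy to exhibit with explicit 2-by-2 matrices. The sketch below assumes the real SO(2, 2) setting, treats $X$ as an ordinary real matrix acting on the column $\lambda$, and builds the shift $X - Y$ as a rank-one matrix that annihilates $\lambda$. All numerical values and function names are hypothetical illustrations.

```python
# A twistor (lambda, mu) fixes X only up to shifts D with D @ lambda = 0.
# ASSUMPTION: real 2x2 matrices, with mu = X @ lambda written without index gymnastics.

def matvec(m, v):
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def null_shift(kappa, lam):
    """Rank-one matrix D = kappa (column) times a row orthogonal to lam, so D @ lam = 0."""
    row = (lam[1], -lam[0])
    return [[kappa[i] * row[j] for j in range(2)] for i in range(2)]

# Arbitrary illustrative data.
X = [[1.0, 2.0], [0.5, -1.0]]
lam = [2.0, 1.0]
mu = matvec(X, lam)
D = null_shift([0.7, -1.3], lam)

def shifted_point(s):
    """The point Y = X + s * D on the line through X."""
    return [[X[i][j] + s * D[i][j] for j in range(2)] for i in range(2)]

residuals = []
for s in (-2.0, 0.5, 3.0):
    y_lam = matvec(shifted_point(s), lam)
    residuals.append(max(abs(y_lam[i] - mu[i]) for i in range(2)))

print(residuals, det(D))
```

Every point $Y = X + sD$ reproduces the same $\mu$, and $\det D = 0$ expresses that the separation is null, so the one-parameter family traces out a null line.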

Points in twistor space thus represent null lines in spacetime. This fact realizes Penrose’s
vision of a representation in which light rays are somehow more fundamental than
spacetime events. Notice that if we scale the twistor $Z = (\lambda, \mu)$ by any complex number $t$,
that is, let $Z \to tZ$, then the solution $X$ of (21) remains unchanged.

At this point, it is also convenient to follow the mathematicians and complexify $Z$, that
is, think of $Z = (\lambda, \mu)$ as 4 complex, rather than 4 real, numbers. Then the matrix $X$
will in general not be hermitean, and the corresponding $X^\mu$ are complex, describing a
complexified Minkowski spacetime. (Because of the scaling freedom, $Z$ does not actually
live in the 8-dimensional space $C^4$, but in the 6-dimensional complex projective space
denoted by $CP^3$.) Note that since a line is characterized by two points that lie on it (call
them $X$ and $Y$), we can think of the point-to-line map from twistor space to spacetime as
a 1-to-2 map $Z \to (X, Y)$.

If a point in twistor space corresponds to a null line in spacetime, what does a line 
in twistor space correspond to in spacetime? You might want to think about this before 
reading on. 

Consider the straight line in twistor space going through two points $Z_A$ and $Z_B$. The
two points describe two null lines in spacetime. Do they intersect? In other words, do the
two equations $\mu_A = X\lambda_A$ and $\mu_B = X\lambda_B$ share an $X$ as a common solution?

They do, and the common solution is given by

$$X^{\dot\alpha\alpha} = \frac{\mu_A^{\dot\alpha}\lambda_B^{\alpha} - \mu_B^{\dot\alpha}\lambda_A^{\alpha}}{\langle\lambda_A, \lambda_B\rangle} \qquad (22)$$

which you can verify by direct substitution. (Recall that the “metric” for spinor indices
is antisymmetric.) Actually, this solution is essentially fixed by symmetry and scaling
considerations. For example, from $\mu_A = X\lambda_A$ and $\mu_B = X\lambda_B$, we see that if we scale
$\mu_A \to t\mu_A$ and $\mu_B \to t\mu_B$, then clearly we have $X \to tX$. On the other hand, under
$\lambda_A \to t\lambda_A$ and $\lambda_B \to t\lambda_B$, we should have $X \to t^{-1}X$. Also, $X$ should be symmetric under
$A \leftrightarrow B$.

Thus, we have a map $(Z_A, Z_B) \to X_{AB}$. Indeed, take any point $Z_C = uZ_A + (1 - u)Z_B$,
for $u$ an arbitrary complex number, and you can easily show that $X_{AC} = X_{AB}$. Thus, rather
pleasingly, a line in twistor space corresponds to a point in spacetime. See figure 1b.
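The line-to-point correspondence can also be verified by brute force. The following sketch works with complex 2-component spinors, takes $\langle ab\rangle$ to be `a[0]*b[1] - a[1]*b[0]`, and builds the common solution as an explicit 2-by-2 matrix acting on column spinors; this fixes one sign convention for the raised spinor index, which is an assumption of the sketch. The twistor values and function names are hypothetical.

```python
# From two twistors on a line to the spacetime point X, in the spirit of (22).
# ASSUMPTION: a twistor is a pair (lam, mu) of complex 2-vectors and X acts as a matrix.

def bracket(a, b):
    return a[0] * b[1] - a[1] * b[0]

def perp(a):
    """Row vector with perp(a).a = 0 and perp(b).a = <ab>."""
    return (a[1], -a[0])

def outer(col, row):
    return [[col[i] * row[j] for j in range(2)] for i in range(2)]

def matvec(m, v):
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def spacetime_point(za, zb):
    """X = (mu_A perp(lam_B) - mu_B perp(lam_A)) / <lam_A, lam_B>."""
    (lam_a, mu_a), (lam_b, mu_b) = za, zb
    first = outer(mu_a, perp(lam_b))
    second = outer(mu_b, perp(lam_a))
    norm = bracket(lam_a, lam_b)
    return [[(first[i][j] - second[i][j]) / norm for j in range(2)] for i in range(2)]

def combine(u, za, zb):
    """The twistor u * Z_A + (1 - u) * Z_B on the line through Z_A and Z_B."""
    return tuple(tuple(u * x + (1 - u) * y for x, y in zip(pa, pb))
                 for pa, pb in zip(za, zb))

def max_diff(m, n):
    return max(abs(m[i][j] - n[i][j]) for i in range(2) for j in range(2))

# Arbitrary illustrative twistors.
Z_A = ((1 + 0.5j, 0.2), (0.3, -1j))
Z_B = ((0.4, 1 - 1j), (2.0, 0.7 + 0.1j))

X_AB = spacetime_point(Z_A, Z_B)
residual_A = max(abs(x - y) for x, y in zip(matvec(X_AB, Z_A[0]), Z_A[1]))
residual_B = max(abs(x - y) for x, y in zip(matvec(X_AB, Z_B[0]), Z_B[1]))

Z_C = combine(0.3 + 0.8j, Z_A, Z_B)
line_diff = max_diff(spacetime_point(Z_A, Z_C), X_AB)

print(residual_A, residual_B, line_diff)
```

The first two residuals confirm that both null lines pass through the same $X$; the third confirms that any other point on the twistor line gives the same spacetime point.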

To summarize, a point in twistor space corresponds to a null line in spacetime, and a 
line in twistor space corresponds to a point in spacetime. Cool, eh? 

Incidentally, (22) indicates that two complex null lines in complex Minkowski spacetime 
generically intersect. Note that this is not true of two arbitrary null lines in real Minkowski 
spacetime. 

Now that we have defined points and straight lines joining two points in twistor space, 
we can go on to study planes, triangles, polygons, tetrahedrons, polyhedrons, and more 
generally, polytopes, in direct analogy to the familiar objects in Euclidean space. In a truly 
amazing discovery, Hodges⁹ realized that the scattering amplitudes we have been talking
about can be interpreted as the volumes of polytopes in momentum-twistor space.*


Appendix 1: A quick review of matrix algebra 


As promised, here I go over some concepts that you may be unfamiliar with. I hate to lose anybody who has 
gotten this far. 

The hermitean conjugate of a complex matrix $M$, written as $M^\dagger$, is defined to be the complex conjugate of
its transpose, thus $M^\dagger = (M^T)^*$. The matrix $M$ is said to be hermitean if it is equal to its hermitean conjugate:


* Explaining what momentum-twistor space is would take us too far beyond the scope of an introductory 
textbook on gravity. 
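$M = M^\dagger$. Apply this to an arbitrary 2-by-2 complex matrix:

$$\begin{pmatrix} u & v \\ w & z \end{pmatrix}^\dagger = \begin{pmatrix} u & w \\ v & z \end{pmatrix}^* = \begin{pmatrix} u^* & w^* \\ v^* & z^* \end{pmatrix} \qquad (23)$$

The matrix is hermitean if $u$ and $z$ are real, and $v = w^*$. Thus, given four real numbers $p^\mu = (p^0, p^1, p^2, p^3)$,
the matrix in (1) is indeed the most general 2-by-2 hermitean matrix. Define the three Pauli matrices

$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (24)$$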


These three matrices, together with the 2-by-2 identity matrix $I$ (for convenience, define $\sigma^0 = I$), form a complete
basis in the sense that any 2-by-2 hermitean matrix can be written as $(p^0I - p^i\sigma^i) = -p_\mu\sigma^\mu$ with 4 real numbers
$p_\mu = \eta_{\mu\nu}p^\nu$. Our convention is such that the index on a Pauli matrix can be freely raised and lowered, for example
$\sigma^2 = \sigma_2$. The product of two Pauli matrices is given by

$$\sigma^i\sigma^j = \delta^{ij}I + i\epsilon^{ijk}\sigma^k \qquad (25)$$

which you can verify by direct computation. Here $\epsilon^{123} = +1$ denotes the antisymmetric symbol.

Write $\vec\sigma = (\sigma^1, \sigma^2, \sigma^3)$, and let $\vec z = (z^1, z^2, z^3)$ be 3 complex numbers. Verify that $(\vec z\cdot\vec\sigma)^2 =
z^iz^j\sigma^i\sigma^j = z^iz^j(\delta^{ij} + i\epsilon^{ijk}\sigma^k) = \vec z^{\,2}$. Define $|\vec z| = (\vec z^{\,2})^{1/2}$ and $\hat z = \vec z/|\vec z|$. (Note that $|\vec z|$ defined here is in general a complex
number.) Expand in the usual Taylor series the exponential $L = e^{i\vec z\cdot\vec\sigma} = \sum_{n=0}^{\infty}(i\vec z\cdot\vec\sigma)^n/n! = \cos|\vec z| + i\hat z\cdot\vec\sigma\sin|\vec z|$,
where we arrived at the last step by separating the sum into two sums, one over even $n$, the other over odd $n$.
Using this result, you can check that $\det L = 1$. I suspect that many readers have seen this for $\vec z$ a real vector.
If you have never seen Pauli matrices before, you might wish to skip this chapter entirely at a first reading and 
come back to it later. 
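For readers who like to see such identities confirmed by machine, here is a minimal sketch that compares the Taylor series of $e^{i\vec z\cdot\vec\sigma}$ with the closed form $\cos|\vec z| + i\hat z\cdot\vec\sigma\sin|\vec z|$ for a complex $\vec z$, and checks that the determinant is 1. The choice of $\vec z$, the number of series terms, and the helper names are all arbitrary assumptions of this illustration.

```python
# Check exp(i z.sigma) = cos|z| + i (z.sigma/|z|) sin|z| and det L = 1 for complex z.
import cmath

IDENTITY = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def z_dot_sigma(z):
    return [[sum(z[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def exp_series(m, terms=40):
    """Partial sum of m**n / n! for n = 0 .. terms - 1."""
    total = [[complex(IDENTITY[i][j]) for j in range(2)] for i in range(2)]
    term = [row[:] for row in total]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, m)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def closed_form(z):
    norm = cmath.sqrt(sum(c * c for c in z))  # |z|, in general complex
    zs = z_dot_sigma(z)
    return [[cmath.cos(norm) * IDENTITY[i][j] + 1j * cmath.sin(norm) / norm * zs[i][j]
             for j in range(2)] for i in range(2)]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

z = (0.3 + 0.2j, -0.5, 0.1 - 0.4j)  # arbitrary illustrative complex vector
generator = [[1j * x for x in row] for row in z_dot_sigma(z)]
L_series = exp_series(generator)
L_closed = closed_form(z)
max_diff = max(abs(L_series[i][j] - L_closed[i][j])
               for i in range(2) for j in range(2))

print(max_diff, det(L_closed))
```

The closed form relies only on $(\vec z\cdot\vec\sigma)^2 = \vec z^{\,2}$, which is why it continues to hold when $|\vec z|$ is complex.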

Fine. Now let us go back to the transformation (2): $p' = L^\dagger pL$. As explained in the text, this produces a Lorentz
transformation of $p^\mu$ into $p'^\mu$, provided that $\det L = 1$. Using the representation $L = e^{i\vec z\cdot\vec\sigma}$, we see that if $\vec z$ is real,
the transformation is a rotation, while if $\vec z$ is imaginary, the transformation is a boost. Work this out!

We can also count. The statement det L = 1 imposes 2 real conditions on a matrix with 4 complex numbers, 
so that L is characterized by 8 — 2 = 6 real numbers. In other words, the Lie algebra of SL(2, C) has 6 generators. 
On the other hand, we know that the Lorentz transformations consist of 3 rotations and 3 boosts. Indeed, the 
discussion just given already indicates what the precise correspondence is. 

In the text, we raise spinor indices with the antisymmetric symbol* $\epsilon^{\alpha\beta}$ according to $\mu^\alpha = \epsilon^{\alpha\beta}\mu_\beta$. We wish to
lower spinor indices with $\epsilon_{\alpha\beta}$ according to $\mu_\alpha = \epsilon_{\alpha\beta}\mu^\beta$. This requires $\epsilon_{\alpha\beta}\epsilon^{\beta\gamma} = \delta_\alpha^\gamma$ and thus $\epsilon_{12}\epsilon^{21} = 1 = -\epsilon_{12}\epsilon^{12}$.
Thus, we have to define $\epsilon_{12}$ and $\epsilon^{12}$ with opposite signs, which leads to all kinds of pesky signs when dealing
with spinors. I would advise you not to worry too much about signs when reading this chapter.¹⁰ Typically, in
this subject in particular, and in theoretical physics in general, the relative signs matter, but not the overall signs
(unless you are building a bridge or something like that).

A useful identity is $\sigma_2\sigma_i^T\sigma_2 = -\sigma_i$, which you can verify by evaluating this expression for the three different
values of $i$. In parallel with $\sigma^\mu = (I, \vec\sigma)$, define $\bar\sigma^\mu = (I, -\vec\sigma)$, then $\mathrm{tr}\,\sigma^\mu\bar\sigma^\nu = -2\eta^{\mu\nu}$. Using this identity, we can
show that the scalar product of two vectors $p$ and $q$ is given by


2p -q = 6"? raga pp = —Pade ape” = tt(porg" 02) = Pydy tt(o"'G") (26) 


For q = p, we see, upon recognizing the definition of the determinant, that this reduces to p - p = e%? eh Poi Ppp 
= det p. 
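Two of the Pauli-matrix identities used in this appendix can be confirmed directly: $\sigma_2\sigma_i^T\sigma_2 = -\sigma_i$ and $\mathrm{tr}\,\sigma^\mu\bar\sigma^\nu = -2\eta^{\mu\nu}$. The sketch below assumes $\sigma^\mu = (I, \vec\sigma)$, $\bar\sigma^\mu = (I, -\vec\sigma)$, and a metric with diagonal $(-1, +1, +1, +1)$; that sign choice is an assumption made here so that the trace identity comes out with the stated sign. The helper names are hypothetical.

```python
# Direct verification of two Pauli-matrix identities.
# ASSUMPTION: eta = diag(-1, +1, +1, +1), sigma^mu = (I, sigma), sigma_bar^mu = (I, -sigma).

SIGMA0 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA_DIAGONAL = [-1, 1, 1, 1]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def trace(m):
    return m[0][0] + m[1][1]

def close(a, b):
    return all(abs(a[i][j] - b[i][j]) < 1e-12 for i in range(2) for j in range(2))

sigma2 = SIGMA[1]
conjugation_ok = all(
    close(matmul(matmul(sigma2, transpose(s)), sigma2), scale(-1, s))
    for s in SIGMA
)

sigma_mu = [SIGMA0] + SIGMA
sigma_bar = [SIGMA0] + [scale(-1, s) for s in SIGMA]
trace_ok = all(
    abs(trace(matmul(sigma_mu[m], sigma_bar[n]))
        - (-2 * (ETA_DIAGONAL[m] if m == n else 0))) < 1e-12
    for m in range(4) for n in range(4)
)

print(conjugation_ok, trace_ok)
```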


Appendix 2: Inversion 


In chapter IX.9, we discussed the inversion of spacetime. Perhaps it would not surprise you that inversion comes 
out quite elegantly in the language of twistors. From (19), we learned that the spacetime coordinates X are 


* Do not confuse the two different antisymmetric symbols $\epsilon^{ijk}$ and $\epsilon^{\alpha\beta}$! The former carries vector indices
$i, j, k = 1, 2, 3$, while the latter carries spinor indices $\alpha, \beta = 1, 2$.
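determined by the relation (with the irrelevant index $a$ suppressed)

$$\mu_{\dot\alpha} = X_{\dot\alpha\alpha}\lambda^\alpha \qquad (27)$$

It follows* that $X^{\beta\dot\alpha}\mu_{\dot\alpha} = X^{\beta\dot\alpha}X_{\dot\alpha\alpha}\lambda^\alpha = X^2\lambda^\beta$. We obtain

$$\lambda^\beta = \frac{X^{\beta\dot\alpha}}{X^2}\mu_{\dot\alpha} \qquad (28)$$

Thus, inversion corresponds to the interchange $\lambda \leftrightarrow \mu$.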


Exercises 


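1 For two lightlike vectors $p$ and $q$, write $p_{\alpha\dot\alpha} = \lambda_\alpha\tilde\lambda_{\dot\alpha}$ and $q_{\alpha\dot\alpha} = \mu_\alpha\tilde\mu_{\dot\alpha}$. Calculate the Lorentz scalar product
$p\cdot q$ in terms of $\lambda$, $\mu$, $\tilde\lambda$, and $\tilde\mu$.

2 Fourier transforming (16), show that the 4-gluon scattering is

$$M(1^+, 2^-, 3^+, 4^-) = \frac{\langle 24\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle}$$

to lowest order. Comparing with (9), do you see a pattern?

3 Fourier transform the cubic vertex for gravity (17) to show that

$$M(1^{--}, 2^{--}, 3^{++}) = \left(\frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 31\rangle}\right)^2 \qquad (29)$$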


Notes 


1. See, for example, http://online.kitp.ucsb.edu/online/qcdscat11/. For a pedagogical introduction, see
chapters N.2-4 in QFT Nut.
2. Names are not so important; I give them just so that you can chat at a cocktail party.
3. See, for example, QFT Nut, chapter III.8.
4. It is derived using a recursion technique explained in, for example, QFT Nut, chapters N.2 and N.3.
5. Or vice versa, as you like.
6. The need for the absolute value involves an argument that we cannot go into here. Here is a cryptic explanation
almost designed to add to your puzzlement: starting with the representation $\delta(x) = (2\pi)^{-1}\int dp\, e^{ipx}$, we can
write formally $\mathrm{sign}(x) = 2(2\pi)^{-1}\int dp\, e^{ipx}p^{-1}$ and $|x| = 2(2\pi)^{-1}\int dp\, e^{ipx}p^{-2}$, facts that you can verify by
differentiating these two integrals with respect to $x$. The statement is that the second integral in this sequence
appears in Yang-Mills theory, while the third appears in Einstein gravity. I refer you to N. Arkani-Hamed,
F. Cachazo, C. Cheung, and J. Kaplan, arXiv:0903.2110v2, for more details.
7. As explained in endnote 4 in chapter X.2, we have $\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\, e^{-ikx} = \lim_{K\to\infty}\frac{1}{2\pi}\int_{-K}^{K} dk\, e^{-ikx} =
\lim_{K\to\infty}\frac{1}{\pi}\int_0^K dk\cos(kx) = \lim_{K\to\infty}\sin(Kx)/(\pi x)$. The representation used in the text is the 4-dimensional
generalization of this representation: $\delta^4(x) = \delta(t)\delta(x)\delta(y)\delta(z) = \int\frac{d^4k}{(2\pi)^4}e^{ikx}$ with $kx = \eta_{\mu\nu}k^\mu x^\nu$
and a slight abuse of notation (making $x$ represent more than one thing).
8. And even less about the $(2\pi)^4$.
9. A. Hodges, arXiv:0905.1473; N. Arkani-Hamed, J. Bourjaily, F. Cachazo, A. Hodges, and J. Trnka,
arXiv:1012.6030.
10. Readers worried about signs are referred to appendix E in QFT Nut, second edition, in which the signs are
allegedly correct. It contains additional material related to the discussion in this appendix.


* Remember to raise and lower spinor indices with the antisymmetric symbol! (Here it is $X^2 = \eta_{\mu\nu}X^\mu X^\nu$
as usual.)


X.7 The Cosmological Constant Paradox


The graviton knows about everything 


Gravity knows about everything, whatever its origin, luminous or dark, even the energy 
contained in fluctuating quantum fields. 

This omniscience of gravity lies at the root of the gravest, or if you prefer, one of 
the gravest, puzzles of theoretical physics, namely the cosmological constant paradox.¹
According to quantum field theory, spacetime is a boiling sea of quantum fluctuations, 
and according to Einstein, gravity should know all about this. 

Allegedly, quantum field theorists can reliably estimate the energy density of this boiling 
sea, but somehow the theoretical value they are led to disagrees enormously with observa- 
tion, and by enormously, we are not talking about a mere few orders of magnitude. In this 
chapter, we discuss how this dismal and embarrassing situation for theoretical physics 
comes about. 


The vacuum as a boiling sea of quantum fluctuations 


In chapter VI.2, I already mentioned quantum fluctuations contributing to the cosmolog- 
ical constant. Another way of expressing this is that in quantum field theory, good old 
Minkowskian spacetime is unstable. It gets driven to de Sitter spacetime. 

A full understanding of quantum fluctuations requires some acquaintance with quan- 
tum field theory, but you can readily grasp the origin of the cosmological constant paradox 
with a rudimentary knowledge of quantum mechanics. Consider the harmonic oscillator. 
Classically, a mass attached to a spring attains its lowest energy, namely 0 by definition, 
when it sits quietly at the bottom of the potential well. In quantum mechanics, however, 
due to the Heisenberg uncertainty principle, there is constant and unavoidable fluctua- 
tion in the particle’s position, and the lowest energy the particle can attain is not 0 but 
$\frac{1}{2}\hbar\omega$, where $\omega$ denotes the (circular) frequency of the oscillator.² This irreducible amount
of energy is known as the zero point energy.

You probably know that the electromagnetic field can be treated as a superposition of
waves known as modes, each with some wave vector $\vec k$ (defined by the inverse of the
wavelength $\lambda$ times $2\pi$) and vibrating with frequency $\omega(k) = c|\vec k|$. (Recall also chapters II.3
and IX.4.) Each of these modes corresponds to a harmonic oscillator.* When the
electromagnetic field is quantized, each mode contributes $\frac{1}{2}\hbar\omega(k)$ to the zero point energy. This
represents the minimum amount of energy in any given mode, present even when the
electromagnetic field is not excited, in the same way that $\frac{1}{2}\hbar\omega$ represents the energy of
the harmonic oscillator even when it is not excited. In other words, in quantum
electrodynamics, the electromagnetic field contributes an energy to spacetime even when there
is no electromagnetic field present! This energy verily deserves the name vacuum energy.

To determine the total vacuum energy, we simply sum over all modes, one for each
value of $\vec k$: thus, $\sum_{\vec k}\frac{1}{2}\hbar\omega(k) \sim \sum_{\vec k}|\vec k|$ in natural units. (To do the sum, we follow
a standard procedure in quantum mechanics: put the system in a box of volume $V$ and
impose periodic boundary conditions on the electromagnetic wave. Then $\vec k$ becomes a
discrete, rather than continuous, variable, so that the sum makes sense. In the limit
$V \to \infty$, the sum $\sum_{\vec k}$ tends to the integral† $V\int d^3k$.)

The end result³ is that quantum fluctuations of a field contribute to the vacuum energy
per unit volume by an amount $\Lambda \sim (V\int d^3k\,\omega(k))/V \sim \int^{M_c} dk\, k^2\, k \sim M_c^4$. Here $M_c$
(traditionally known as a cutoff⁴ in quantum field theory) expresses our threshold of
ignorance. We are saying that we understand⁵ the electromagnetic field up to an energy or
momentum scale $M_c$, corresponding to some maximum value of $k$, beyond which we dare
not go, so that we integrate only up to $M_c$. Thus, $M_c^4$ represents a conservative estimate.
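The quartic dependence on the cutoff can be made explicit with a short numerical integration. The sketch below assumes natural units, a single mode per wave vector, and the standard continuum measure $d^3k/(2\pi)^3$ for the mode sum (the text drops such numerical factors), so that the zero point energy density integrates to $M^4/(16\pi^2)$ for a cutoff $M$. The function names and the step count are hypothetical choices.

```python
# Zero point energy density with a momentum cutoff, in natural units.
# ASSUMPTIONS: one oscillator per wave vector, omega(k) = |k|, measure d^3k / (2 pi)^3.
import math

def energy_density(cutoff, steps=100_000):
    """Midpoint-rule estimate of the integral of (1/2)|k| over |k| < cutoff."""
    dk = cutoff / steps
    total = 0.0
    for n in range(steps):
        k = (n + 0.5) * dk
        # Shell volume 4 pi k^2 dk times the zero point energy k / 2.
        total += 4.0 * math.pi * k * k * 0.5 * k * dk
    return total / (2.0 * math.pi) ** 3

def closed_form(cutoff):
    """Analytic value cutoff^4 / (16 pi^2) of the same integral."""
    return cutoff ** 4 / (16.0 * math.pi ** 2)

numeric = energy_density(1.0)
exact = closed_form(1.0)
doubling_ratio = energy_density(2.0) / numeric

print(numeric, exact)
print(doubling_ratio)  # doubling the cutoff multiplies the density by about 16
```

The prefactor $1/(16\pi^2)$ depends on the conventions assumed here; the fourth-power scaling does not.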

In summary, each quantum field‡ contributes $M_c^4$ to the vacuum energy density, possibly
with different values of $M_c$ for different fields.


A humongous discrepancy between expectation and observation 


We do not know precisely the mass scale $M_c$ at which our current understanding of
quantum field theory starts to break down. Traditionally, people take for $M_c$ the Planck
mass $M_P \sim 10^{19}$ GeV, at which quantum gravity kicks in (see the following chapter). But
this gives a vacuum energy density of $M_P^4 = M_P/l_P^3$, and we don’t have to bother to put in
any numbers to see that this is way way off.

Indeed, go ahead, take your best guess of what M, might be. If you are inclined to be 
conservative, you might think that Mp, all the way up in the clouds, is way too high. OK, 
how about ~1 GeV, about equal to the proton mass? Or perhaps ~0.5 MeV, close to the


* As was mentioned in appendix 2 of chapter VII.3. In chapter VI.4, we also alluded to the fact that the 
electromagnetic field may be regarded as an infinite number of harmonic oscillators. 

† Note that this is dimensionally correct, since $k$ has dimensions of inverse length.

‡ Another way of expressing the difficulty is to say that a quantum field, such as the electromagnetic field,
contains a very large number of oscillators, one for each $\vec k$, or equivalently, one at each point in space.
electron mass $m_e$? Surely, the basic principles that go into quantum field theory have been
verified experimentally up to that kind of energy scale. You then predict a vacuum energy
density of $m_e^4 = m_e/(1/m_e)^3$.

You don’t even have to bother looking up the observational data. Just look around you.
Even with $M_c$ as low as $\sim m_e$, you are still way way off. Do you see the vacuum filled with
one $m_ec^2$ worth of energy in a volume the size of the Compton wavelength of the electron?

By any measure, this is the Mother of all Discrepancies between what the theorists think 
and what the experimentalists observe. Our theoretical expectation is not the result of some 
crummy calculations based on somebody’s pennyworth model. In fact, forget field theory, 
all we need is good old dimensional analysis. In natural units, energy density has dimen- 
sion of mass to the fourth power. The only natural mass associated with gravity is the Planck 
mass, but whatever smaller mass we put in, even $m_e$, we still get an unacceptably large
energy density. This nasty discrepancy is known as the cosmological constant paradox. 

Rightly or wrongly, I presumed in chapter VI.2 that the observed dark energy is the 
fabled cosmological constant. The evidence seems increasingly to favor this simplest 
of hypotheses. Even if this were not the case, the paradox still remains. Why is the 
contribution of quantum fields to the vacuum energy so small? 

Instead of giving you the observational value of the dark energy density $\Lambda$ in some units
such as pounds per parsec cubed, I find it more convenient to define the mass scale $M_\Lambda$
according to $\Lambda = M_\Lambda^4$. Observationally, the mass scale associated with the dark energy
density comes out to be $M_\Lambda \sim 10^{-3}$ eV. Expressing the observational data in this way
shows clearly how humongous the discrepancy is. Even if we take $M_c$ to be as small as the
electron mass, the ratio between theoretical expectations and experimental reality would
be $\sim(\frac{1}{2}10^6/10^{-3})^4 \sim 10^{35}$.
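The size of the mismatch is a one-line computation once the two mass scales are written in the same units. The sketch below uses round illustrative values: a dark energy scale of $10^{-3}$ eV, an electron mass of about $0.5 \times 10^6$ eV, and a Planck mass of about $10^{28}$ eV. The variable names are hypothetical.

```python
# Ratio of the naive vacuum energy density (cutoff^4) to the observed one.
# ASSUMPTION: round order-of-magnitude inputs, all expressed in eV.
import math

M_OBSERVED_EV = 1.0e-3      # mass scale of the observed dark energy density
ELECTRON_MASS_EV = 0.5e6    # roughly 0.5 MeV
PLANCK_MASS_EV = 1.0e28     # roughly 10^19 GeV

def discrepancy(cutoff_ev, observed_ev=M_OBSERVED_EV):
    """(cutoff / observed)^4, the ratio of the two energy densities."""
    return (cutoff_ev / observed_ev) ** 4

electron_orders = math.log10(discrepancy(ELECTRON_MASS_EV))
planck_orders = math.log10(discrepancy(PLANCK_MASS_EV))

print(round(electron_orders, 1))  # orders of magnitude with the electron mass as cutoff
print(round(planck_orders, 1))    # orders of magnitude with the Planck mass as cutoff
```

Even the most timid choice of cutoff overshoots by roughly 35 orders of magnitude, while the Planck-mass choice overshoots by more than 120.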

Another way of expressing the cosmological constant paradox is that M, is much smaller 
than anything that is considered reasonable in particle physics. The observation of dark 
energy appears to suggest that there is a hitherto unknown mass scale of $\sim 10^{-3}$ eV in
physics. Here is a curious fact. In the late 1990s (strangely, around the same time dark 
energy was discovered), neutrinos, which up until that time were thought to be massless, 
were experimentally found to oscillate, which implies, according to standard particle 
theory, that they are massive. Since there are 3 kinds of neutrinos, their masses, which 
have not yet been completely nailed down experimentally, can span quite a range. But they 
appear to have generic values, very roughly, of order 107? eV. Is this pure coincidence?® 
In any case, there might be some physics we have yet to understand at a mass scale of
$\sim 10^{-3}$ eV.


The largest and the smallest masses 


I also find it convenient to express $M_\Lambda$ by repackaging a remark from appendix 1 of
chapter X.3. Define $M_U = 1/L_{\rm universe}$, with $L_{\rm universe}$ the size of the universe, say the
Hubble radius, as some sort of Compton mass of the universe. Then (X.3.7) becomes
$M_\Lambda \sim \sqrt{M_PM_U}$. With $M_P \sim 10^{19}$ GeV and $M_U \sim 2 \times 10^{-33}$ eV, we find $M_\Lambda \sim 4 \times 10^{-3}$ eV,
which is of course just the statement that the dark energy almost singlehandedly closes the
universe. Well, $M_P$ is the largest mass considered in fundamental physics, and surely, $M_U$
is the smallest, and so interestingly, $M_\Lambda$ is the geometric mean between the largest and
the smallest, as already remarked on in chapter X.3 but in terms of length scales. (By the
way, between us friends, when I say “largest” and “smallest,” you know what I am talking
about. To the nitpickers: yes, I know about $\infty$ and 0.)
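The geometric-mean statement is quick to confirm with the quoted numbers. A minimal sketch, with both masses converted to eV and with hypothetical variable names:

```python
# Geometric mean of the largest and smallest mass scales quoted in the text.
import math

PLANCK_MASS_EV = 1.0e19 * 1.0e9   # 10^19 GeV expressed in eV
UNIVERSE_MASS_EV = 2.0e-33        # inverse size of the universe, in eV

geometric_mean_ev = math.sqrt(PLANCK_MASS_EV * UNIVERSE_MASS_EV)

print(f"{geometric_mean_ev:.2e} eV")  # a few times 10^-3 eV
```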

I define $\Lambda$ as an energy density by writing the Einstein-Hilbert action as $\int d^4x\sqrt{-g}\,(\Lambda +
\frac{1}{G}R)$. Trivially, we can also regard it as a sort of curvature by writing the action as
$\int d^4x\sqrt{-g}\,\frac{1}{G}(\lambda + R)$. Then $\lambda$ is given by the inverse square of some length, call it $L_\Lambda$.
Again, observationally, we know that the two terms in the action have comparable weight,
and hence the length scale associated with the cosmological constant is on the order
of the size of the universe. In other words, the radius of curvature associated with the
cosmological constant is given by $L_\Lambda = M_P/M_\Lambda^2 \sim 1/M_U \sim L_{\rm universe}$.

For the record, let us also restore the 3 fundamental constants. Then we have $\Lambda \sim
M_\Lambda^4 c^5/\hbar^3 = (M_\Lambda c^2)/(\hbar/M_\Lambda c)^3$ and the curvature of the universe $L_{\rm universe}^{-2} \sim G c M_\Lambda^4/\hbar^3$. We
note in passing that although all three fundamental constants appear here, it is not clear, in 
spite of the cube of physics mentioned in the introduction to this book, whether quantum 
gravity is essential in unraveling the cosmological constant paradox. At least naively, gravity 
appears to be merely acting as a probe. (Recall that analogous remarks were made in 
connection with Hawking radiation in chapter VII.3.) We would of course prefer to think 
that the cosmological constant paradox and Hawking radiation will eventually prove to be 
indispensable keys for unlocking the mystery of quantum gravity. 
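
With the constants restored, the numbers can be put in SI units. The sketch below assumes the dimensional-analysis forms $\Lambda \sim (M_\Lambda c^2)^4/(\hbar c)^3$ for the energy density and $L^{-2} \sim G\Lambda/c^4$ for the curvature, with all factors of order 1 dropped, and takes $M_\Lambda c^2 \sim 4 \times 10^{-3}$ eV from the text. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m / s
G_NEWTON = 6.67430e-11  # Newton's constant, m^3 / (kg s^2)
EV = 1.602176634e-19    # one electron volt in joules


def vacuum_energy_density(m_lambda_ev):
    """Energy density (M c^2)^4 / (hbar c)^3 in J / m^3, order-1 factors dropped."""
    energy = m_lambda_ev * EV
    return energy**4 / (HBAR * C) ** 3


def curvature_length(m_lambda_ev):
    """Length L defined by 1 / L^2 = G * Lambda / c^4, in meters."""
    rho = vacuum_energy_density(m_lambda_ev)
    return 1.0 / math.sqrt(G_NEWTON * rho / C**4)


rho = vacuum_energy_density(4.0e-3)
length = curvature_length(4.0e-3)
print(f"energy density ~ {rho:.1e} J/m^3, curvature length ~ {length:.1e} m")
```

The length comes out at the $10^{26}$ m level, that is, on the order of the Hubble radius, as the text asserts.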


Dead as a door nail 


The cosmological constant paradox has been with us for a long time. To the best of my 
knowledge, Pauli was the first to worry about the gravitational effect of the zero point energy 
filling space. He used for $M_\Lambda$ the inverse of the classical radius of the electron and concluded
that the resulting universe could not even reach to the moon!* Many of the greats of 
quantum physics were also skeptical of the zero point energy. At the 1913 Solvay Congress, 
Einstein declared that he did not believe in the zero point energy, writing to Ehrenfest that 
the concept was dead as a door nail. However, the experiment $\gamma + H_2 \to H + H$ convinced
Pauli and others. For energy to be conserved, $\frac{1}{2}\hbar\omega$ has to be included in the energy of the
$H_2$ molecule.

At present, one could hardly doubt the reality of the zero point energy. Theoretically, 
it comes directly from the Heisenberg uncertainty principle. Experimentally, the liquidity 
of helium at zero temperature provides direct evidence, according to standard textbooks. 
People also often cite the Casimir effect,⁷ namely the force between two conducting plates
generated by quantum fluctuations, as showing that the vacuum energy is perfectly real. 


* Surely, for Pauli, the zero point energy $\frac{1}{2}\hbar\omega$ was in the category of beautifully and intriguingly wrong, way
beyond the infamous category of “not even wrong.” 
One word of caution, however. Experiments on the Casimir effect measure the force, that
is, the variation of the vacuum energy contained between the two plates as we vary the 
separation between them.⁸

With the passage of time, people found better things to worry about, and the issue was 
forgotten until Y. B. Zel’dovich raised it again in the late 1960s. I would say that general 
awareness that a paradox was indeed lurking did not occur till the 1970s, particularly in 
the West. (One reason was that particle theorists in the United States by and large did not 
worry⁹ about gravity and cosmology until the publication of Weinberg’s influential books.)
Until the observation of dark energy in the late 1990s, there was only an upper bound to the 
vacuum energy density. Since this upper bound is on the order* of $10^{-123}$
in natural units, particle theorists generally declared that, for some unknown reason, the
cosmological constant is mathematically zero. (An ultimate example of proof by authority!) 
For decades, many pinned their hopes first on supersymmetry, then on supergravity, and 
finally on superstrings. Unfortunately, nobody was able to produce a compelling argument 
for $\Lambda = 0$.

The cosmological constant paradox may thus be summarized as follows. In some 
suitable units, the cosmological constant was expected to have the value ~$10^{123}$. This is
so huge that it was decreed to be zero identically, while the measured value (here the 
presumption that the dark energy is the cosmological constant comes in) turned out to 
be ~1. 

Incidentally, while $\Lambda$ was decreed to be identically zero by particle theorists, it was never
banished by observational cosmologists, who needed it to reconcile various discrepancies 
in the data (for example, a universe younger than the earth due to an erroneous value 
of the Hubble constant in the 1930s and the clustering of the redshift data of quasars in 
the 1960s). This contrarian, but data based, point of view was particularly championed by 
P. J. E. Peebles. 


Naturalness 


In discussing the cosmological constant paradox, I should mention briefly the naturalness 
dogma or doctrine in high energy theory, as was alluded to in passing in chapter X.3. It is 
sometimes said jokingly that there are only two dimensionless numbers in fundamental 
physics: 1 and 0 ($\infty$ being of course the inverse of 0). Again, between friends, the symbol
1 is understood to encompass numbers like $2\pi$. In other words, if you choose units
appropriately, physical quantities should have the magnitude you would reasonably expect. 
If a dimensionless number is exceptionally small, you should have an explanation for it. 
(One of my favorite examples is the ratio of the speed of sound in metals to the speed 
of light $c_s/c$. Solid state theory explains why this number is small: it is composed of the
electromagnetic coupling $\alpha \sim 1/137$ and the ratio of the electron mass to the proton mass
$m_e/m_p \sim 10^{-3}$. See also appendix 3.) Indeed, this naturalness doctrine is what makes the


* Since the discrepancy is so large, it hardly matters what nominal number I put here. 
art of dimensional analysis possible. Stated thus, the naturalness doctrine sounds rather
plausible, and, duh, perhaps even natural. 

In high energy theory, the naturalness doctrine was sharpened and forcefully articulated 
by ’t Hooft. The statement is that if a dimensionless number $\zeta$ is unexpectedly small, then
a new symmetry ought to emerge when $\zeta = 0$. (An example is the electron mass: if $m_e = 0$,
then a continuous chiral symmetry appears.) In the case of the cosmological constant, the 
natural candidate symmetry is scale invariance. Unfortunately, scale invariance excludes 
not only the cosmological constant, but also the Einstein-Hilbert action by the very fact that 
Mp sets a mass scale. Furthermore, quarks and leptons are not massless (but acquire mass 
through the Higgs field). Incidentally, this is intimately connected with the remark in chap- 
ter X.3 about terms with mass dimensions less than 4. Over the decades, theorists have 
searched in vain, as I said, for a symmetry principle that would guarantee $\Lambda = 0$. The discovery
that $\Lambda$ is small but not zero complicates the situation further, as mentioned earlier.


The extreme ultra infrared 


Particle physicists, who also call themselves high energy physicists, readily profess igno- 
rance about physics at high energies and short distances, namely the ultraviolet regime, 
and so ask for ever more energetic accelerators. But they generally claim that they un- 
derstand physics at low energies and long distances, namely the infrared regime, at least 
in principle and in broad outline. The cosmological constant paradox indicates that there 
may be a serious flaw in this view. Truth be told, we know almost nothing about physics 
in what we may call the extreme ultra infrared, namely physics on cosmological distance 
scales. One plausible approach to the cosmological constant paradox is that somehow in 
the extreme ultra infrared, which we may define as corresponding to distances beyond the 
galactic scale, gravity responds to vacuum energy differently. 

The most naive approach is to soften the contribution of the vacuum energy to the right 
hand side of Einstein’s equation $R^{\mu\nu} - \frac{1}{2} g^{\mu\nu}R = 8\pi G T^{\mu\nu}$ by acting with some differential
operator $f(L^2D^2)$ on $T^{\mu\nu}$, where $D$ denotes the covariant derivative and $L$ some cosmological
length scale. The right hand side is effectively multiplied by $f(L^2/L_{\rm phenomenon}^2)$,
where $L_{\rm phenomenon}$ denotes the length scale of the phenomenon under study. The strategy
is then to require $f$ to have the properties $f(\to\infty) = 1$ (to retain the success of the solar
system tests and so forth, for $L_{\rm phenomenon} \ll L$) and $f(\to 0) = 0$ (to switch off gravity’s
awareness of the vacuum energy, for $L_{\rm phenomenon} \gg L$).

Needless to say, the various proposals that have been discussed in the literature are all 
rather ad hoc, arbitrary, and unattractive to varying degrees, particularly given the elegant 
structure of Einstein gravity. A differential operator of the form $f(L^2D^2)$ would almost
invariably imply that the resulting equation is highly nonlocal. Furthermore, equations 
of this type tend not to be derivable from any reasonable action principle and are to be 
regarded as phenomenological rather than fundamental. 

Another approach is to add nonlocal terms directly to the action. We already briefly 
discussed one such proposal for nonlocal cosmology in chapter X.3. If the goal is to merely 
fit observation, then we can certainly craft an action that would do the job. 
The discussion of effective field theory in chapter X.3 also makes clear the need for
nonlocal terms. Local terms with mass dimensions higher than the Einstein-Hilbert term 
are important only at short spacetime distances. As for local terms with mass dimensions 
lower than the Einstein-Hilbert term, there is only the cosmological constant term. If we 
insist on locality, then to have any cosmological impact, we are squeezed, so to speak, 
between the Einstein-Hilbert term and the cosmological constant term. 

Another possibility is to violate some cherished principles, such as Lorentz invariance. 
We should keep an open mind, as we are dealing with almost unfathomably large distances 
in space and time here. In the appendices, we mention this and other possibilities. 


The coincidence problem and inflation 


The cosmological constant paradox is made even more mysterious by the cosmic coincidence
problem. As explained in chapters VIII.1 and VIII.2, the energy density $\rho$ in matter
varies with the scale factor $a$ of the expanding universe like $1/a^3$, while the energy density
in the cosmological constant varies like $1/a^0$. It is remarkable that they are comparable
now. Why now? 
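
To see how sharp the coincidence is, one can tabulate the ratio of the two densities as the universe expands. The sketch below assumes, purely for illustration, a present-day matter-to-vacuum ratio of 0.3/0.7 (a commonly quoted round figure, not a number taken from this chapter) and uses only the scalings $1/a^3$ and $1/a^0$ stated above; the function name is hypothetical.

```python
def matter_to_vacuum_ratio(a, ratio_today=0.3 / 0.7):
    """Ratio rho_matter / rho_Lambda at scale factor a, with a = 1 today.

    Matter dilutes like 1/a**3 while the cosmological constant stays fixed,
    so the ratio itself scales like 1/a**3. ratio_today is an assumed input.
    """
    return ratio_today / a**3


for a in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"a = {a:>6}: rho_matter / rho_Lambda ~ {matter_to_vacuum_ratio(a):.1e}")
```

The ratio changes by three orders of magnitude for every factor of 10 in the scale factor, and is of order 1 only in a narrow window around the present epoch.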

Inflation adds to the mystery. As explained in chapter VIII.4, inflation is essentially 
driven by a vacuum energy, which amounts to an effective cosmological constant. How 
is it that after the universe exits from inflation, the vacuum energy manages, not to turn 
itself off, but to shrink to an infinitesimal shadow of its former self? Theorists speaking 
of both inflation and of the cosmological constant may be exhibiting a severe case of the 
“wanting the cake and eating it too” syndrome. 

The only plausible “explanation” is the anthropic principle, or if you prefer, “anthropic 
lack of principle,” as some physicists call it. The anthropic principle states that physics 
must be consistent with the existence of physicists (which you can define in whatever way 
you like). 

A strong version states that there are certain physical phenomena that physicists will 
not be able to explain using (what most physicists would agree as) the traditional approach 
of physics. The bold claim is that the cosmological constant paradox is one of them. Of 
course, one could legalistically take apart every word in the statement of the principle just 
given. For instance, what do you mean by “will not”? Is the implied time scale forever and 
forever, or is it merely until the advocate of the strong anthropic principle ceases to exist? A 
priori, how do we know which phenomena fall into the category of inexplicable by physics 
as we know it? 

A weak version of the anthropic principle states that the goal of physics is to correlate 
observed phenomena (such as cannonballs falling from the Tower of Pisa and the preces- 
sion of the perihelion of Mercury) and that to the list of observed phenomena, we should 
add the existence of humans. Certainly, most people would not object to this version. For 
instance, we could use it to calculate the distance of our planet from its sun, given various 
inputs about the properties of the sun, the temperature range in which biochemical pro- 
cesses can operate, and so on. The calculation yields only an upper and a lower bound on 
the distance, but it is a calculation nonetheless. 
I will discuss the anthropic principle further in appendix 9, but for now, I mention that,
in some sense, the smallness of $\Lambda$ was predicted by Weinberg using a very weak version
of the anthropic principle. This very weak version of the anthropic principle should be 
acceptable to most theoretical physicists (certainly to me, for what it’s worth!): it merely 
correlates two observations, namely that galaxies formed and $\Lambda$ is very small. If $\Lambda$ were
larger than a certain critical value, which turns out to be not much larger than the observed 
value, galaxies would not have formed. You are then free to extend Weinberg’s reasoning 
to say that had galaxies not formed, then humans would not exist. 


Linkage between the infrared and the ultraviolet 


Quantum field theorists speak of ultraviolet (that is, high energy) physics versus infrared 
(that is, low energy) physics. Typically, in calculating a Feynman diagram, one encounters 
an integral of the form $\int d^4k\, f(k)$, and if the dominant contribution comes from the large
k (that is, high momentum or high frequency) region, we say that the relevant physics is 
ultraviolet, or UV for short. Contrariwise, if the dominant contribution comes from the 
small k (low momentum or low frequency) region, we say that the relevant physics is in- 
frared, or IR for short. The quantum field theoretic prediction we had for the cosmological 
constant, $\Lambda \sim \int d^4k \sim \int^{M_\Lambda} dk\,k^3 \sim M_\Lambda^4$, is manifestly a UV effect: the dominant contribution
comes from the region $k \sim M_\Lambda$, from quantum fluctuations with momentum comparable
to the cutoff. (Although we obtain our prediction using a handwaving argument about oscillators,
a calculation using Feynman diagrams gives essentially the same result, which,
after all, is basically fixed by dimensional analysis.) If we follow tradition and take $M_\Lambda$ to
be $M_P$, the physics underlying the cosmological constant is about as UV as it can be.
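
That the integral is dominated by the cutoff region can be made quantitative. With the integrand $k^3$, the integral from 0 up to $k_{\max}$ equals $k_{\max}^4/4$, so the fraction contributed by momenta above $xM_\Lambda$ is $1 - x^4$, independent of the cutoff. A small sketch (the function name is hypothetical):

```python
def fraction_above(x):
    """Fraction of the integral of k**3 from 0 to M that comes from k > x * M.

    Uses the closed form: the integral of k**3 dk from 0 to k_max is
    k_max**4 / 4, so the cutoff M drops out of the ratio.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    return 1.0 - x**4


for x in (0.1, 0.5, 0.9):
    print(f"k > {x} M contributes {fraction_above(x):.4f} of the total")
```

More than nine tenths of the integral comes from the upper half of the momentum range, which is the sense in which the effect is "manifestly UV."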

One fundamental feature of quantum field theory is that the physics at different energy 
scales naturally segregate themselves,¹⁰ speaking very roughly. The general belief is that
if we are studying the UV, we don’t have to worry about the IR. And vice versa: if we are 
studying the IR, we don’t have to worry about the UV. 

The cosmological constant paradox appears to be the first exception to this general 
picture. Although the cosmological constant is generated by UV physics, it controls the 
expansion of the universe, which is definitely an IR phenomenon, indeed, what we called 
the extreme ultra infrared regime. 


What is vacuum energy? 


I prefer to banish speculations to the appendices. (My list of speculations on the cosmo- 
logical constant paradox is far from complete and heavily biased toward what I know. The 
appendices are meant to give you a flavor of the sort of things that have been considered.) 
Instead, I end this chapter with a few general remarks. We think that we know how to 
calculate the vacuum energy using quantum field theory, following established rules. The 
cosmological constant paradox, however, indicates that we will ultimately have to face up
to the question, “What is vacuum energy?” 

The question reminds me of an earlier question: “What is heat?” (Or perhaps also, “What 
is the ether?”) Everybody but physicists knew what heat was, but it proved to be extremely
elusive to define. It was not a substance or a fluid (the caloric), as was once thought. The true 
answer had to wait until the nature of matter was understood. As you and I know, physics 
progresses through asking first the what, then the how, and finally the why questions, 
for example, “What is matter made of?” “How do these atoms behave?” “Why are there 
atoms?” Now we know the answers to all three questions, but the why question had to wait 
until the advent of baryogenesis and leptogenesis (as was discussed in chapter VIII.3), and 
skeptics certainly still abound. At the least, we don’t know the detailed answer. 

It would be a bit disappointing if dark energy or the cosmological constant proves 
to be merely due to some mundane mechanism, such as the presence of dime-a-dozen 
scalar fields (see chapter VIII.4), which ultimately have to be fine-tuned. I think that most 
theoretical physicists would hope that the cosmological constant paradox, like the great 
paradoxes of the late 19th century, will lead us to a deeper understanding of physics. 

The universe says to the quantum field theorist, “I am doing just fine, thank you, but 
something is wrong with your understanding of the vacuum energy, or your understanding 
of how the gravitational field responds to the vacuum energy.” 

A distinguished colleague said to me recently, “The cosmological constant paradox is 
more than a paradox; it’s a profound public humiliation of theoretical physicists.” 


Appendix 1: Scaling at cosmological distances 


The history of physics is full of examples of reasoning by analogy that turn out to be fruitful. As explained in the 
text, the cosmological constant paradox can be summarized as follows: 


expected value      enormous
decreed value       mathematically 0
observed value      tiny but not 0


Have we ever encountered something similar? I proposed long ago that the story of proton decay may provide 
such an analogy.¹¹

I will not go into the particle physics behind proton decay here. Suffice it to say that, at one time, the expected 
value of the proton decay rate was enormous, then it was decreed to be mathematically 0, while the observed 
value* turns out to be extremely tiny but nonzero. The important question is how theorists managed to reduce 
the enormous expected value down to the extremely tiny observed value. 


* I am fudging slightly here: at the moment, we only have an upper bound for the observed value. Experi- 
mentalists have yet to observe proton decay, but that unfortunate fact might merely be due to the fact that the 
detectors constructed thus far are too small. As I explained in chapter VIII.3, theorists have compelling reason 
to believe that the proton does decay, so we can easily imagine that experimentalists in some other civilization 
were not as unlucky and had observed proton decay soon after grand unified theory was proposed. In any case, 
the particular details of how particle physics evolved in our civilization do not concern us here. 
The secret is scaling (as embodied in the renormalization group ideas in quantum field theory developed by
M. Gell-Mann and F. Low, K. Wilson, and others). As was explained in chapter VIII.3, the physics responsible for 
proton decay originates at some grand unified distance scale, while the actual decay occurs on a distance scale 
on the order of the proton size. In going from one distance scale to the other, we traverse more than 16 orders 
of magnitude, which suffices to reduce the expected value enormously to the appropriate level. 

Might we try the same trick¹² and scale¹³ the cosmological constant term to make it less relevant at large
distances compared with the Einstein-Hilbert curvature term? 

In one particular scheme, we have to require Einstein gravity to start to deviate from Lorentz invariance
beyond a length scale $L_{\rm IR} \sim 1$–$10^3$ kpc, on the order of the galactic or cluster scale. It is then possible to scale
the cosmological constant by a factor $(L_{\rm EUIR}/L_{\rm IR})^{z-1} \sim (10^4$–$10^7)^{z-1}$, where we take the extreme ultra infrared
length scale $L_{\rm EUIR} \sim 10^4$ Mpc to be the size of the visible universe. Here $z$ measures the deviation from Lorentz
invariance and corresponds to what is called the dynamical exponent in condensed matter physics.¹⁴ To screen
the cosmological constant to the desired value, we need $z \sim 20$–$30$, which is at least not outrageously large.
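
The quoted range of $z$ can be checked with a line of arithmetic. The sketch below assumes that the screening factor is the ratio of the two length scales raised to the power $z - 1$, with the ratio lying between $10^4$ and $10^7$, and that the required suppression is about 123 orders of magnitude (a nominal figure; the exact number hardly matters). The function name is hypothetical.

```python
def dynamical_exponent(orders_to_screen, log10_length_ratio):
    """Solve (length ratio)**(z - 1) = 10**orders_to_screen for z."""
    return 1.0 + orders_to_screen / log10_length_ratio


ORDERS = 123  # assumed nominal discrepancy, in powers of 10
for log10_ratio in (4, 7):
    z = dynamical_exponent(ORDERS, log10_ratio)
    print(f"length ratio 10^{log10_ratio}: z ~ {z:.1f}")
```

The two ends of the range give $z$ between roughly 19 and 32, in line with the estimate $z \sim 20$–$30$.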

There are many serious difficulties with this picture; the interested reader is referred to the literature for 
details.¹⁵ For one thing, the resulting action is nonlocal in time at cosmological distances. Perhaps an optimist
would think that this could provide a hint about the nature of time. For another, while we may be able to scale 
the vacuum energy away at cosmological distances, the vacuum energy can still make its effects felt over smaller 
regions. As one possible speculation, we can imagine each local region of the universe trying to expand and 
pressing against other regions in “rebellious symphony,” perhaps something like a cluster of soap bubbles. 

Einstein curved spacetime. Here we are suggesting that the logical next step might be to endow spacetime 
with some “substance,” such as would be the case in some kind of foamy picture of emergent spacetime. 


Appendix 2: The universe is secretly acausal, but only the universe 
knows about it 


Arkani-Hamed et al.¹⁶ have proposed modifying Einstein’s equation to

$$M_P^2\left(R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R\right) - \tfrac{1}{4} M^2 g_{\mu\nu} \bar{R} = T_{\mu\nu} \qquad (2)$$


where $\bar{R}$ denotes the spacetime averaged scalar curvature $\bar{R} = \int d^4x \sqrt{-g}\,R / \int d^4x \sqrt{-g}$. This equation is manifestly
nonlocal and acausal: physics now depends not only on what happened in the past but also on what will
happen in the far future. But by construction, the modification to Einstein’s equation takes effect only if the future
is de Sitter with constant scalar curvature determined by the cosmological constant, $\bar{R} = -4\Lambda/(M_P^2 + M^2)$. To account
for observation, the new mass scale $M$ has to be huge, taking values ranging from ~$10^{48}$ GeV to ~$10^{80}$ GeV,
depending on the assumed value of the cosmological constant one wishes to “neutralize.” Unhappily, another
enormous mass scale has to be introduced into physics.

In this approach, the modification is clearly designed not to matter for any situation other than cosmological. 
For the solar system, for example, $\bar{R}$ would come out to be practically zero. The universe is secretly acausal but
only the universe knows about it! I must say that in recent years, theoretical physicists have become increasingly 
adept at hiding new physics from experimentalists. 

Arkani-Hamed et al. argue that any mechanism to neutralize the cosmological constant must be acausal: 
when a vacuum energy density turns on, the alleged mechanism must wait for a cosmological time period to 
find out whether the energy density is indeed a cosmological constant. I am very much troubled by the thought 
that physics may be ultimately nonlocal, even if it is only on the cosmological scale. 


Appendix 3: Possibility of an algebraic solution 


Another possibly relevant historical analogy involves the inverse light speed $\zeta = c^{-1}$. Consider the expected value
of $\zeta$, before it was measured, say, in some civilization in a galaxy far far away. The expected value is enormous
in natural units, if propagation in the ether is assumed to be similar to sound waves in ordinary materials, let
alone ocean waves. By the naturalness dogma, we might have expected $\zeta$ to be comparable to $\zeta_{\rm sound}$. Just as in the
cosmological constant paradox, we can see that this is way off merely by looking around us. Evidently, $\zeta \ll \zeta_{\rm sound}$.
Given this, physicists would have been tempted to decree (proof by authority) that $\zeta$ is mathematically 0. But
eventually, it was observed by the extragalactic version of Ole Rømer¹⁷ that the observed value turned out to be
tiny but not 0 (as both Galileo and Newton had thought). In this case, the naturalness dogma would have been 
off by a measly 6 orders of magnitude or so. 

How was this $\zeta$ paradox resolved? It was resolved by making $c$ part of the kinematics. We went from the
Galilean to the Lorentz group, and $c$ became a conversion factor between space and time. The unification of
space and time into spacetime allows us to choose units in which $c = 1$, a value protected by Lorentz invariance.
In other words, it does not get renormalized! (In contrast, in nonrelativistic theories, $c$ would get renormalized.)
Quantum fluctuations do not affect $\zeta = c^{-1}$, thanks to its being part of an algebra.

Does this analogy tell us anything? To solve the $\zeta$ paradox, we had to go from the Galilean group to the
Lorentz group. Perhaps we need to go one step further and extend the Lorentz group to the de Sitter group! The
cosmological constant $\Lambda$, like $c$ before it, would then become a fundamental constant of nature. Just as $c$ is a
fixed constant in the Lorentz algebra, $\Lambda$ then becomes a fixed constant in the de Sitter algebra. In this sense, the
question of why the cosmological constant is so small compared to what the naturalness dogma would lead us 
to expect might eventually turn out to be the wrong question to ask, or at least the wrong way of phrasing the 
question. 

Another analogy might be illuminating. Imagine a civilization on a very large planet, much larger than 
our own. Physicists in this civilization could have developed physics to a high level of sophistication without 
realizing that their world was actually round. The symmetry group of physics was found to be the Euclidean 
group, consisting of two translations and a rotation about the vertical axis, generated by $P_x = \partial_x$, $P_y = \partial_y$,
and $J = x\partial_y - y\partial_x$. But technology kept advancing, and with the development of powerful binoculars, a new
phenomenon was discovered: ships going out to sea did not simply become smaller and smaller, but vanished
over the horizon. The rate was eventually measured to be tiny but definitely not zero, as leading theorists had
decreed. But all the efforts theorists put in trying to calculate this rate from known physics were in vain.

Later (who knows how much later), it was realized that the invariance group of physics was not the Euclidean 
group, but the rotation group $SO(3)$, generated by $J_x = y\partial_z - z\partial_y$, $J_y = z\partial_x - x\partial_z$, and $J_z = x\partial_y - y\partial_x$. The
Euclidean group, previously held to be “sacred,” turned out to be generated by $P_x \sim (z\partial_x - x\partial_z)/R$, $P_y \sim
-(y\partial_z - z\partial_y)/R$, and $J = J_z = x\partial_y - y\partial_x$ in the limit $z \sim R \to \infty$, where the very large length $R$ was revealed
to be the radius of the planet. Furthermore, $R$ was not renormalized by quantum fluctuations.
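
The way the rotation algebra flattens into the Euclidean one as $R$ grows can be checked explicitly. The sketch below is an illustration under stated assumptions: it uses a standard $3 \times 3$ matrix representation of the rotation generators (whose sign conventions need not match those of the differential operators above; only the structure of the algebra matters), rescales two of them by $1/R$ as in the text, and verifies that the two would-be translations commute up to a term of order $1/R^2$. All function names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def scale(c, a):
    """Multiply every entry of the matrix a by the number c."""
    return [[c * a[i][j] for j in range(3)] for i in range(3)]


# A standard matrix representation of so(3): [JX, JY] = JZ and cyclic.
JX = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
JY = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
JZ = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]


def contracted_generators(radius):
    """Rescaled generators P_x = J_y / R and P_y = -J_x / R."""
    r = Fraction(radius)
    return scale(1 / r, JY), scale(-1 / r, JX)


for radius in (1, 10, 1000):
    px, py = contracted_generators(radius)
    # [P_x, P_y] = J_z / R**2, which fades away as R grows ...
    assert commutator(px, py) == scale(1 / Fraction(radius) ** 2, JZ)
    # ... while J = J_z keeps rotating the two "translations" into each other.
    assert commutator(JZ, px) == py
    assert commutator(JZ, py) == scale(-1, px)
    print(radius, commutator(px, py)[1][0])
```

Exact rational arithmetic is used so that the $1/R^2$ term can be compared without rounding. For a planet-sized $R$ the failure of the translations to commute is tiny but not zero, which is the algebraic counterpart of ships vanishing over the horizon.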

If this analogy contains some elements of truth, then it also suggests, like the previous analogy, that the 
cosmological constant should be built into the invariance group of physics. It is then perfectly understandable 
that our continuing struggle to calculate the cosmological constant would fail. I have in mind a formulation of 
gravity based on the de Sitter group, not the study of Einstein gravity in a de Sitter spacetime, much as Einstein 
gravity is a formulation of gravity based on the Lorentz group, not the study of Newtonian gravity in some 
Minkowskian setting. 


Appendix 4: Unimodular gravity 


In chapter VI.2, I mentioned that we can always sneak an additive constant into the Lagrangian, but only when 
gravity is not around. Since gravity knows about this additive constant through $\sqrt{-g}$, perhaps we can solve
the cosmological constant paradox by nailing $g$ down, not allowing it to vary. The result is known as unimodular
gravity.¹⁸ The ugly part of the proposal is that we are then no longer allowed to make any coordinate transformation
$x \to x'(x)$ that we please, but only those that preserve $g$.

Fixing $g$ to be equal to $-1$ would seem to render the cosmological constant term impotent and hence irrelevant.
But in fact, it comes back! 

To obtain Einstein’s field equation, we varied the action in chapter VI.5 and used the identity $\delta \int d^4x \sqrt{-g}\,R =
-\int d^4x \sqrt{-g}\,(R^{\mu\nu} - \frac{1}{2} g^{\mu\nu} R)\,\delta g_{\mu\nu}$. Now we are told that we cannot vary arbitrarily, but may consider only those
variations $\delta g_{\mu\nu}$ that do not change the determinant and hence satisfy the constraint $\delta g = 0$, which (with the
use of an identity derived back in chapter V.6) works out to be $g^{\mu\nu}\delta g_{\mu\nu} = 0$; in other words, $\delta g_{\mu\nu}$ is traceless.
Let us split the Einstein tensor $R^{\mu\nu} - \frac{1}{2} g^{\mu\nu}R$ into a traceless part and a traceful part: $(R^{\mu\nu} - \frac{1}{4} g^{\mu\nu}R) - \frac{1}{4} g^{\mu\nu}R$.
When multiplied by $\delta g_{\mu\nu}$ in the variation of the action, the traceful part drops out. Thus, we only get the traceless
part of Einstein’s equation


$$R^{\mu\nu} - \tfrac{1}{4} g^{\mu\nu} R = T^{\mu\nu} - \tfrac{1}{4} g^{\mu\nu} T \qquad (3)$$


To see that the cosmological constant comes back, write the $-\frac{1}{4}$ on the left hand side of (3) as $-\frac{1}{2} + \frac{1}{4}$ and
covariantly differentiate. Using the fact that the covariant derivative of the Einstein tensor and of the energy
momentum tensor both vanish, we obtain $\partial_\mu R = -\partial_\mu T$, which can be solved to give $R = -T + C$. Insert this
back into (3) and watch the integration constant $C$ reappear as the cosmological constant.
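
The last step can be written out explicitly. What follows is a worked sketch in the units of equation (3), with factors such as $8\pi G$ suppressed and (3) taken in the form $R^{\mu\nu} - \frac{1}{4} g^{\mu\nu}R = T^{\mu\nu} - \frac{1}{4} g^{\mu\nu}T$; it uses only the relation $R = -T + C$ just obtained.

```latex
% Substitute R = -T + C into the traceless equation (3):
\begin{align}
R^{\mu\nu} - \tfrac{1}{4} g^{\mu\nu}(-T + C) &= T^{\mu\nu} - \tfrac{1}{4} g^{\mu\nu} T
\;\;\Longrightarrow\;\;
R^{\mu\nu} = T^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} T + \tfrac{1}{4} g^{\mu\nu} C , \\
% Rebuild the Einstein tensor by subtracting (1/2) g R, again with R = -T + C:
R^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} R
&= T^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} T + \tfrac{1}{4} g^{\mu\nu} C
   + \tfrac{1}{2} g^{\mu\nu} T - \tfrac{1}{2} g^{\mu\nu} C
 = T^{\mu\nu} - \tfrac{1}{4} g^{\mu\nu} C .
\end{align}
```

The full Einstein equation is thus recovered with an extra term proportional to $g^{\mu\nu}$, that is, a cosmological constant of magnitude $C/4$ in these units, which now enters as an arbitrary integration constant rather than as a parameter in the action.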
Thus, unimodular gravity does not solve the problem but makes some people “feel more comfortable,” because
in theoretical physics, we have the license, supposedly, to set integration constants to whatever we want. 

Instead of nailing g down, we might consider a softer constraint by including in the action a term like 
$\int d^4x\,V(g)$, where the function $V(g)$ has a deep minimum at $g = -1$ (for example, $V(g) = A(g + 1)^2$, with $A$
large). This would serve to encourage $g$ to stay close to $-1$. Again, we would no longer be allowed to make just
any coordinate transformation we feel like. 


Appendix 5: Decaying cosmological constant 


Over the years, many physicists have had many (“crazy”) thoughts about gravity. One possibility, once considered 
highly speculative, was to entertain a decaying cosmological constant,¹⁹ $\dot{\Lambda} \neq 0$. But these days, with a multitude
of scalar fields²⁰ around, rolling down this hill or that, this possibility would be considered commonplace rather
than outrageous. 

A dynamical realization of this might be to have the vacuum energy in de Sitter spacetime dissipate by 
producing particle antiparticle pairs. The mechanism would be similar to that involved in Hawking radiation. 
However, an order of magnitude estimate would seem to suggest that the effect is far too small. 


Appendix 6: Breaking free of local field theory 


The cosmological constant paradox suggests to some people that we might have to break free of local field 
theory entirely. One possibility is to add terms not of the form $\int d^4x\,(\cdots)$ to the action, in a vaguely Landau-Ginzburg
sort of approach to the action.²¹ Interestingly, without too much contortion, we could obtain the relation
$M_\Lambda \sim \sqrt{M_P M_U}$ already mentioned in the text.


Appendix 7: Lagrange multiplier for the volume of spacetime 


Here is a remark I find intriguing. What is the cosmological constant $\Lambda$? It is the Lagrange multiplier for the
volume $\int d^4x \sqrt{-g}$ of spacetime, whatever that means: $S_{\rm cosmological} = -\int d^4x \sqrt{-g}\,\Lambda = -\Lambda \int d^4x \sqrt{-g}$.

In statistical or thermal physics, the Lagrange multiplier for the volume $\int d^3x$ of the system (picture a balloon
filled with gas) has a name: the pressure $P$. We certainly understand the concept of pressure well. It is also under
the experimentalist’s control. But here the universe is not some container pushing into some external space, at
least not in the standard view.²²


Appendix 8: Deleting Feynman diagrams and the equivalence principle 


This appendix is for those readers with some familiarity with Feynman diagrams. Let us consider the diagram 
responsible for the cosmological constant. Start with a matter field (for example, an electron field) loop. This 
diagram describes the vacuum fluctuation of the electron field. An electron and a positron pop out of the vacuum, 
propagate, and then the two annihilate each other. As explained in chapter VII.3, this goes on all the time. 
Now a graviton wanders by and couples to the electron line: the graviton is sampling the energy generated by 
this particular vacuum fluctuation. Ultimately, it is this diagram that causes all our hand-wringing over the 
cosmological constant. Suppose you were to work long and hard and come up with a rule or theory that cleverly 
deletes this diagram, thus solving the cosmological constant paradox. As emphasized by J. Polchinski, any such 
rule or theory would always be doomed to fail because of the equivalence principle. 

The argument is as follows. Connect the diagram by, say, two photon lines to the propagator of some atomic 
nucleus, say, aluminum or iron. This diagram thus contributes to the gravitational mass of the nucleus. On the 
other hand, consider the same diagram with the atomic nucleus but with the graviton removed, a diagram that 
presumably has nothing to do with gravity. But this diagram contributes to the inertial mass of the nucleus. 


Thus, with the enormous accuracy to which the equivalence principle has been tested, we already know that 
the diagram with the graviton attached cannot be deleted. But we are claiming that, to resolve the cosmological 
constant paradox, we have some rule to delete this diagram. Basically, Einstein gravity is so tightly constructed 
that we cannot easily bend the rules without upsetting something else. 

The trouble is once again that physics, as we understand it, should be local: at the point where the graviton 
couples to the electron, how can the graviton “know” what the electron loop is going to do? It cannot know 
whether the electron is just going to loop back upon itself, or that before looping back, the electron is “planning” 
to emit two photons, which subsequently will be absorbed by a nucleus. 

The local nature of Feynman diagrams, plus the constraint from the experimental verification of the equivalence
principle, makes it difficult to imagine how any rule could be invented to delete one Feynman diagram and
not another. Perhaps one loophole is offered by the phrase “nothing to do with gravity”; perhaps even a diagram
without the graviton is subject to the requirements of some ultimate theory of gravity.


Appendix 9: Argument using the anthropic principle 


Here I repeat, and elaborate on, some of the remarks made in the text regarding the anthropic principle. The 
anthropic principle states that the laws of physics must be consistent with the fact that there are physicists around 
to discuss them. Opinions on this principle differ enormously, and I do not wish to go into this raging controversy 
here. Suffice it to say that many find it vaguely distasteful, perhaps even unprincipled. At one level, the statement 
is almost trivially true, something of a tautology. 

The goal of physics is to relate apparently disjoint phenomena, for example the moon orbiting the earth and 
the apple falling. One of the great triumphs of physics is to relate these two particular observations. In the text, I 
mentioned that Weinberg showed that the very fact that galaxies formed allowed him to put an upper bound on
$\Lambda$: if $\Lambda$ were too large, the universe would have expanded too fast for galaxies to have formed. (See chapter VIII.3.)
In fact, the observed value is not too far from this upper bound. 

Put this way, I don’t see how the statement can be objectionable: physics relates two different phenomena. 
(But notice that a theoretical framework is needed, namely the expansion of the universe and a scenario for how 
galaxies came to be.) Similarly, if we take the earlier statement and replace in it the phrase “galaxies formed” 
by “humans exist,” the resulting statement, namely that the very fact that humans exist allowed an anthropic 
theorist to put an upper bound on $\Lambda$, is hardly more objectionable. This is no different from using the fact that
humans exist on this particular planet to set bounds on how far we live from our sun. 

One important aspect of the anthropic principle is that to even entertain this principle, one has to be able 
to conceive of different universes with different laws of physics. This is why, although the principle originated 
during the mid-20th century in the study of nucleosynthesis in stars, it did not come into bloom until the advent 
of gauge theories of the strong, weak, and electromagnetic interactions. With gauge theories, one can conceive 
of changing the gauge group and the parameters contained in the theories. In this sense, string theory appears 
to support, or at least to permit, the anthropic principle. 

In the early days, string theory faithfuls hoped that they would be led to a unique ground state, so that all 
fundamental constants of physics, including the gauge coupling strengths, the quark and lepton masses, the 
cosmological constant, and so on and so forth (of course what I mean here are the dimensionless ratios formed 
out of these quantities) would be uniquely fixed. This hope has not been realized, to say the least. In fact, almost 
the exact opposite has occurred. At last count, string theory is said to have $10^{500}$ (the precise number hardly
matters to the innocent bystander) possible ground states, each corresponding to a universe with a different 
set of laws of physics. The only way we know to choose between this plethora of ground states is, allegedly, the 
anthropic principle. Indeed, string theorists have turned this apparent defect of the theory, its inability to predict 
a unique ground state, into a virtue: it is only with this vast wealth of ground states that we can “understand” 
anthropically why the cosmological constant is so tiny. 

My distaste for the anthropic principle (or, at least, my discomfort with it) is that it provides a disincentive to
theoretical physicists to search for explanations in the traditional sense of the word. At one time, some people
invoked the anthropic principle to explain why the neutron is more massive than the proton, even though naive
reasoning²³ involving the energy contained in the electromagnetic field surrounding the proton would have
predicted the precise opposite. The argument was that, if the neutron were less massive, the proton would have
decayed into the neutron, rather than the other way around (through the process $n \to p + e^- + \bar\nu$, mentioned in
chapter VIII.3). The hydrogen atom would disintegrate, and there would be no physicists around. (This argument 
is hardly watertight, since other nuclei could still be stable, and who is to say that we could not build physicists 
without hydrogen.) While we still do not understand exactly why the neutron is more massive than the proton, we 
have at least pushed physics to a deeper level and reduced the question to why the down quark is more massive 
than the up quark. Similarly, the use of the anthropic principle in the context of stellar nucleosynthesis (roughly,
if an excited level in some nucleus did not exist, nucleosynthesis could not have proceeded at the rate that it in
fact does) might have discouraged the development of nuclear theory. I trust that nuclear theory can be developed
to the point that the existence of this level could be understood, at least in principle. It is fair to say that most
physicists, if presented with an anthropic and a traditional explanation for a given phenomenon, would probably
choose the latter. In an ideal world, in a galaxy far far away, perhaps universities could afford to have separate
departments of physics and of anthropic studies.


Notes 


1. This chapter is adapted from A. Zee, “Gravity and Its Mysteries: Some Thoughts & Speculations,” in
World Scientific, 2008. The gist of the story outlined here was told over three birthday parties: Dirac’s 80th

. For this argument, we need to invoke merely free field theory: we are just adding up zero point energies of
harmonic oscillators. We don’t even have to learn how the electromagnetic field is coupled to the electron
field and all the other charged fields. Or do we?

$M_{EW}/M_\Lambda$, where $M_{EW} \sim 10^3$ GeV is the scale of the electroweak interaction. The left hand side is $\sim 10^{19}/10^3 = 10^{16}$,
while the right hand side is $\sim 10^3\,{\rm GeV}/10^{-3}\,{\rm eV} = 10^{15}$. Of course, we would be off by 3 orders of
magnitude if we took $M_{EW} \sim 10^2$ GeV. The actual scale, which is a somewhat loosely defined concept anyway,
is perhaps about 300 GeV.


. For a simple explanation, see QFT Nut, p. 70. The discussion also provides a beautiful realization of the
concept of cutoff in quantum field theory.


. In quantum mechanics, experiments typically measure only energy differences $\Delta E$ and not the energies
themselves. The Casimir effect is a case in point: it does not measure the vacuum energy itself (as is
sometimes erroneously stated).

Once, when I was lecturing in Copenhagen and talking about this analogy, I looked up and saw a picture of
Romer looking down at me. 

18. The notion goes back to Einstein in some sense, and was developed later by H. van der Bij, H. van Dam, Y. J.
Ng, F. Wilczek, A. Zee, A. Dolgov, S. Weinberg, and many others. 

19. I was inspired by Dirac’s large number hypothesis. See endnote 11.

20. I have always been bothered by the liberal and indiscriminate use of scalar fields in particle theory and
cosmology. Quantum field theory textbooks start with scalar fields precisely because they are “without 
quality.” If Nature wanted to show us an elementary scalar field, wouldn’t she have shown us one long 
ago? We have encountered elementary spin 1 fields, an elementary spin 2 field, and in a mysterious twist, 
even elementary spin 1/2 fields. We know about meson fields, but they are clearly composite. An interesting 
question might be whether the Higgs field can be regarded as composite. I have speculated elsewhere that 
perhaps quantum field theory somehow forbids elementary scalar fields. In an improved formulation of 
quantum field theory, might elementary scalar fields not be allowed? 

21. S. Hsu and A. Zee, Mod. Phys. Lett. A20 (2005), pp. 2699-2703, arXiv:hep-th/0406142.

22. Invoking an analogy between quantum hydrodynamics and quantum gravity, G. Volovik has argued that the
cosmological constant paradox could be resolved. Some would maintain, however, that the paradox does not 
depend directly on the quantum nature of gravity, and that gravity merely provides a probe of the fluctuating 
energy in the vacuum. Exploring problems analogous to the cosmological constant in condensed matter 
systems may nevertheless provide a fruitful avenue for further understanding. See G. E. Volovik,
arXiv:gr-qc/0505104; F. R. Klinkhamer and G. E. Volovik, arXiv:0711.3170.

23. See for example A. Zee, Phys. Reports 3C (1972), p. 127.


X.8 Heuristic Thoughts about Quantum Gravity


In search of quantum gravity 


Almost the entirety of this book is devoted to the classical theory of gravity. The quantum 
appeared on only a few occasions, namely in our discussion of Planck’s natural units way 
back in the introduction to the book, of Hawking radiation in chapter VII.3, (somewhat
peripherally) of Kaluza-Klein theory, and of the cosmological constant paradox in
chapter X.7. I have kept the knowledge required of quantum mechanics and quantum field
theory to an absolute minimum. However, if I am going to talk about quantum gravity,
obviously¹ I will have to mention quantum mechanics and quantum field theory. Given
that at various points in this chapter I am assuming that you know quantum field theory, 
many readers will have to take my word for it in connection with various statements, but 
I try to minimize the number of these assertions. The reader who has had no exposure to 
quantum mechanics should skip this chapter. 

Of Einstein’s two offspring, special relativity has been joined with the quantum since
the late 1940s, leading to relativistic quantum field theory. Meanwhile, general relativity
has stubbornly resisted being quantized. Even with the intensive development in recent
decades of two candidate theories, string theory and loop gravity, a theory of quantum
gravity² remains elusive. Certainly, I do not have room³ here to discuss these candidate
theories. I also do not discuss various other approaches⁴ to quantum gravity, notably lattice
gravity⁵ and the notion of asymptotic⁶ safety.⁷ Instead, we will chat heuristically about the
root cause (or causes) of the difficulty in constructing a theory of quantum gravity.


The appearance of fundamental scales 


The seed of discord between gravity and the quantum had already been sown in the 
introduction to this book. In a world without gravity, that is, a world with $\hbar$ and c, but
without gravity (a world in which Newton’s constant G vanishes), we could happily do
physics using relativistic quantum field theory (and its many limits, such as
nonrelativistic quantum mechanics or classical mechanics). 


But as soon as gravity enters into the discussion, the Planck mass $M_P = \sqrt{\hbar c/G}$, the Planck
length $l_P = \hbar/(M_P c) = \sqrt{\hbar G/c^3}$, and the Planck time $t_P = l_P/c = \sqrt{\hbar G/c^5}$ burst upon the scene. By
the way, since by now even mass circulation magazines can mention lightyears without
explanation,⁸ we can set c = 1 without risk of conceptual confusion. Thus, we henceforth
work with only the Planck mass $M_P = \sqrt{\hbar/G}$ and the Planck length $l_P = \hbar/M_P = \sqrt{\hbar G}$. As
$G \to 0$, we note that $M_P \to \infty$ and $l_P \to 0$, and we lose our units. Note also that $M_P l_P = \hbar$.
(As $\hbar \to 0$, $M_P \to 0$ and $l_P \to 0$, as we would expect in the classical world.)
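
To put numbers to these definitions, here is a minimal sketch in Python. The SI values of the constants are rounded CODATA figures that I supply myself (they are not quoted in the text), and the function name `planck_units` is purely illustrative.

```python
import math

# Rounded CODATA values in SI units (assumed here, not taken from the text).
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 kg^-1 s^-2


def planck_units(hbar=HBAR, c=C, g=G):
    """Return (M_P, l_P, t_P) in kg, m, s from the defining square roots."""
    m_p = math.sqrt(hbar * c / g)
    l_p = math.sqrt(hbar * g / c**3)
    t_p = math.sqrt(hbar * g / c**5)
    return m_p, l_p, t_p


m_p, l_p, t_p = planck_units()
print(f"M_P = {m_p:.3e} kg, l_P = {l_p:.3e} m, t_P = {t_p:.3e} s")

# With c restored, the identity M_P l_P = hbar reads M_P l_P c = hbar.
assert math.isclose(m_p * l_p * C, HBAR, rel_tol=1e-12)

# Turning gravity off: M_P grows and l_P shrinks as G is scaled toward zero.
for scale in (1.0, 1e-10, 1e-20):
    m, l, _ = planck_units(g=G * scale)
    print(f"G x {scale:g}: M_P = {m:.3e} kg, l_P = {l:.3e} m")
```

The loop at the end illustrates the statement that we lose our units as G goes to zero: the mass scale runs off to infinity while the length scale collapses to nothing.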


The Planck mass spells trouble 


To see that the appearance of the Planck mass spells trouble, consider the following 
traditional, and fairly well-known, gedanken experiment. Scatter two gravitons elastically 
off each other. Let us now use high school dimensional analysis to determine the scattering 
amplitude $\mathcal{M}$ (a notion you encountered in the preceding chapter), but to do this, you
would have to know that $\mathcal{M}$ is dimensionless,⁹ something I dare say the typical high school
student would not know. You have to take my word for it: $\mathcal{M}$ is dimensionless. If you don’t
want to, I will give in appendix 2 a more elaborate version of this argument not dependent 
on this particular piece of knowledge. 

To leading order, $\mathcal{M}$ is proportional to G. After all, G measures the strength of the
gravitational interaction. In the center-of-mass frame, $\mathcal{M}$ depends on the center-of-mass
energy E and the scattering angles. Since the graviton is massless, G and $\hbar$ are the only
other quantities $\mathcal{M}$ can depend on. So go ahead, see if you can write down a dimensionless
function of E and of the scattering angles that is linear in G.

You are forced to 


$\mathcal{M} = a(\theta)\,GE^2/\hbar + O(G^2) = a(\theta)\,(E/M_P)^2 + O(G^2)$ (1)


with $a(\theta)$ a dimensionless function of the scattering angles $\theta$.

The important point is that out of G and $\hbar$, we can form the combination $G/\hbar$ with
dimensions of inverse mass or energy squared, which in turn requires the amplitude $\mathcal{M}$
to go like $E^2$. As we crank up the energy E past the Planck energy $M_P = \sqrt{\hbar/G}$, the amplitude
increases past unity. But in quantum mechanics, the absolute square of the amplitude gives
the probability for the process to occur, and probability cannot exceed unity. Hence, the
leading order expression (1) for $\mathcal{M}$ eventually violates the unitarity bound basic to quantum
physics.
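
A rough numerical feel for (1) may help. The sketch below assumes the unknown angular function a(θ) is of order one (I simply set it to 1) and uses the conventional value of the Planck mass in GeV, which is my input rather than the text's; the sample energies are illustrative choices.

```python
import math

M_P_GEV = 1.22e19  # Planck mass in GeV with hbar = c = 1 (assumed value)


def leading_amplitude(energy_gev, a_theta=1.0):
    """Leading-order estimate a(theta) * (E / M_P)**2 from equation (1).

    a_theta = 1 is a placeholder: the true angular function is not specified.
    """
    return a_theta * (energy_gev / M_P_GEV) ** 2


samples = [
    ("collider-scale, 10^4 GeV", 1.0e4),
    ("10^16 GeV", 1.0e16),
    ("Planck energy", M_P_GEV),
    ("10 x Planck energy", 10.0 * M_P_GEV),
]
for label, energy in samples:
    amp = leading_amplitude(energy)
    verdict = "exceeds unity" if amp > 1.0 else "within unity"
    print(f"{label:26s} M ~ {amp:.2e}  ({verdict})")
```

At any energy we can actually reach, the amplitude is absurdly small; the trouble with unitarity only sets in once E climbs past the Planck energy.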

Perhaps the most concise way of describing this problem is as follows: in a quantum 
world with gravity, the Planck mass sets an intrinsic energy scale, at which something we 
don’t understand is bound to happen. As I have remarked elsewhere,¹⁰ I find it sobering
that theories in physics have the ability to announce their own eventual failure and hence
their domains of validity, in contrast to theories in some other areas of human thought. 
So Einstein’s theory is literally crying out, telling us that it will fail around the Planck 
energy. 


Minimum length 


Ever since Louis de Broglie astonished the physics world (and won a Nobel Prize in the 
process!) with his assertion that a particle with momentum p behaves like a wave with 
wavelength of the order $\hbar/p$, particle physicists have been pestering heads of governments
(and taxpayers) that they need to build larger and larger accelerators to probe shorter and
shorter distances. Given a beam of particles with energy E, they can probe distances of the
order $l_{dB} \sim \hbar/p \sim \hbar/E$. Allowed enough resources in a world without gravity, they could
keep on increasing the energy and happily probe smaller and smaller distances. 

But in a world with gravity, we have black holes! 

A concentration of mass or energy E in a region smaller than the corresponding
Schwarzschild radius $r_S \sim GE$ is expected to collapse into a black hole. Thus, the colliding
beams we use would collapse when $GE \gtrsim l_{dB} \sim \hbar/E$, precisely when $E \gtrsim M_P$. This
suggests that the Planck length $l_P$ represents a minimum length below which we cannot
probe.¹²
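
The crossover between the two length scales can be located numerically. This is a sketch in SI units with factors of c restored and all factors of order unity dropped, as in the text; the constants are rounded CODATA values and the function names are mine.

```python
import math

HBAR = 1.054571817e-34      # J s
C = 2.99792458e8            # m / s
G = 6.67430e-11             # m^3 kg^-1 s^-2
GEV = 1.602176634e-10       # joules per GeV


def de_broglie_length(energy_j):
    """Distance probed by a beam of energy E: hbar c / E."""
    return HBAR * C / energy_j


def gravitational_radius(energy_j):
    """Schwarzschild-type scale for energy E: G E / c^4 (factor 2 dropped)."""
    return G * energy_j / C**4


def crossover_energy():
    """Energy at which the two lengths coincide: sqrt(hbar c^5 / G)."""
    return math.sqrt(HBAR * C**5 / G)


e_star = crossover_energy()
print(f"crossover energy ~ {e_star / GEV:.3e} GeV")
print(f"length probed there ~ {de_broglie_length(e_star):.3e} m")

# Below the crossover the beam resolves more than its own gravitational radius;
# above it the gravitational radius is the larger of the two.
for factor in (1e-3, 1.0, 1e3):
    e = factor * e_star
    print(f"E = {factor:g} E*: l_dB = {de_broglie_length(e):.2e} m, "
          f"GE = {gravitational_radius(e):.2e} m")
```

The crossover energy comes out at the Planck energy and the length probed there at the Planck length, which is just the dimensional-analysis statement in the paragraph above.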

In the quantum world, physical quantities are constantly fluctuating. The appearance 
of a fundamental length as soon as gravity is turned on suggests that in quantum gravity, 
spacetime itself is fluctuating on the scale of $l_P$, thus leading to the picturesque notion of
spacetime foam. Does this mean, as was just suggested, that $l_P$ represents the minimum
length that we can probe? It seems plausible, but let’s try to make this assertion somewhat 
more precise. 

Historically, the de Broglie relation grew into the uncertainty principle, which I have 
already mentioned on a couple of occasions (particularly in chapters VII.3 and X.7). 
Starting from the fundamental commutation relation $[x, p] = i\hbar$ between the position
operator x and the momentum operator p, Heisenberg showed that the uncertainty in
the position $\Delta x$ and the uncertainty in the momentum $\Delta p$ satisfy the inequality


$\Delta x\,\Delta p \gtrsim \hbar$ (2)


Since the uncertainty principle is so fundamental to quantum physics, it would be
good to recall how it is derived¹³ and to have a precise statement of it. Given a quantum
(hermitean) operator A, we define $\Delta A \equiv A - \langle A\rangle$, a quantity specific to the state in
which we take the expectation value $\langle A\rangle$. Then the mean square deviation is given by
$\langle(\Delta A)^2\rangle = \langle A^2\rangle - \langle A\rangle^2$. The relevant mathematical theorem, which you could look up or
challenge yourself to prove, states that 


$\langle(\Delta A)^2\rangle\,\langle(\Delta B)^2\rangle \ge \tfrac{1}{4}\,|\langle[A, B]\rangle|^2$ (3)
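
For readers who take up the challenge, one standard route to (3) runs through the Cauchy-Schwarz inequality. The sketch below assumes a normalized state $|\psi\rangle$ and hermitean A and B; the auxiliary vectors $|f\rangle$ and $|g\rangle$ are notation introduced here, not in the text.

```latex
\begin{align*}
|f\rangle &\equiv \Delta A\,|\psi\rangle, \qquad |g\rangle \equiv \Delta B\,|\psi\rangle,\\
\langle(\Delta A)^2\rangle\,\langle(\Delta B)^2\rangle
  &= \langle f|f\rangle\,\langle g|g\rangle
  \;\ge\; |\langle f|g\rangle|^2
  \;\ge\; \bigl(\operatorname{Im}\langle f|g\rangle\bigr)^2,\\
\operatorname{Im}\langle f|g\rangle
  &= \frac{1}{2i}\bigl(\langle f|g\rangle - \langle g|f\rangle\bigr)
   = \frac{1}{2i}\,\langle[\Delta A, \Delta B]\rangle
   = \frac{1}{2i}\,\langle[A, B]\rangle,\\
\Rightarrow\quad
\langle(\Delta A)^2\rangle\,\langle(\Delta B)^2\rangle
  &\ge \tfrac{1}{4}\,\bigl|\langle[A, B]\rangle\bigr|^2 .
\end{align*}
```

The last equality in the third line holds because $\langle A\rangle$ and $\langle B\rangle$ are numbers and drop out of the commutator. Taking A = x and B = p with $[x, p] = i\hbar$ gives $\langle(\Delta x)^2\rangle\langle(\Delta p)^2\rangle \ge \hbar^2/4$, the precise form of (2).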


Measuring device collapsing into a black hole


When we are faced with a notion that is typically mumbled fast and loose, a good move is 
to call upon our friend the Smart Experimentalist.¹⁴

We ask SE, “How would you determine if there is a minimum length?” 

SE: “I would take two distance measurements, and try to push the difference between 
them down to arbitrarily small values.” 

So, consider¹⁵ a measuring device of size L and mass M. To determine the position of
something, you have to know the position of the measuring device. We can also think of
the object whose position we want to measure as part of the device. SE proceeds to measure
the position of the device at time 0 and at time t, take the difference s = x(t) - x(0), and
see if that can be made arbitrarily small. 

For simplicity of analysis, assume that the device can move freely, so that the relevant 
Heisenberg operators are related by 


$\hat x(t) - \hat x(0) = \dfrac{\hat p}{M}\,t$ (4)


(If the device is not free, but tied to another mass by a heavy spring, we can always
regard the mass and the spring as part of the device.) Commuting (4) with $\hat x(0)$, we obtain
$[\hat x(0), \hat x(t)] = i\hbar t/M$. (Note that by assumption, $\hat p$ does not depend on time.) It follows from
(3) that


$\langle(\Delta x(0))^2\rangle\,\langle(\Delta x(t))^2\rangle \ge \left(\dfrac{\hbar t}{2M}\right)^2$ (5)

In other words, x(0) and x(t) form a complementary pair and obey the uncertainty relation
$\Delta x(0)\,\Delta x(t) \ge \hbar t/2M$.

If SE tries to get the uncertainty in her measurement of x(0) down, the uncertainty in 
her measurement of x(t) necessarily goes up. Thus, try as she may, the uncertainty in her
measurement of s = x(t) - x(0) is limited by the larger of the two uncertainties $\Delta x(0)$
and $\Delta x(t)$. The best she can do is bring the uncertainty $\Delta s$ down to $\sqrt{\hbar t/M}$, that is,


$\Delta s \gtrsim \sqrt{\hbar t/M}$ (6)


Now comes the key point. In a world without gravity, SE could make $\Delta s$ as small as she
likes. Just take the two measurements in quick succession and build the most massive
measuring device (so it does not quantum jiggle too much) the funding agency would
allow. In other words, make t as small as possible and M as large as possible.

But now we feel the wrath of Einstein’s two intellectual offspring!

Special relativity tells us that t cannot be smaller than the time it takes light to traverse 
the device (otherwise only a part of the device can be regarded as “the device”); so t > L, 
the size of the device. 

General relativity tells us that if we crank up M too much, the device will collapse into a
black hole, and we will not be able to receive the result of the measurement. For the device 
not to be a black hole, we require L > GM. 

Thus, we conclude that 


$\Delta s \gtrsim \sqrt{\hbar L/M} \gtrsim \sqrt{\hbar G} = l_P$ (7)


The Planck length is indeed the smallest distance experimentalists can measure. Note that
the first inequality comes from special relativity, the second from general relativity. In a
world with gravity, we cannot measure distances less than the Planck length. Note the
power of this argument: it does not depend on details of how the device was constructed.
In appendix 1, I give an alternative argument.
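
The chain of inequalities can be checked numerically. The following sketch works in SI units with c restored, drops factors of order unity as the text does, and uses names of my own invention (`best_resolution`, the sample device size); it takes the shortest allowed time t = L/c and refuses devices that would sit inside their own gravitational radius.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 kg^-1 s^-2
L_PLANCK = math.sqrt(HBAR * G / C**3)


def best_resolution(size_m, mass_kg):
    """Order-of-magnitude floor on Delta s from (6), using t = L / c.

    Raises ValueError if L <= G M / c^2, i.e. if the device would be a black hole.
    """
    if size_m <= G * mass_kg / C**2:
        raise ValueError("device is smaller than its gravitational radius GM/c^2")
    t_min = size_m / C                          # special relativity: t > L
    return math.sqrt(HBAR * t_min / mass_kg)    # equation (6)


size = 1.0  # metres; an arbitrary illustrative device size
for fraction in (1e-20, 1e-10, 1e-3, 0.5):
    # Mass expressed as a fraction of the largest mass allowed by L > GM/c^2.
    mass = fraction * size * C**2 / G
    ratio = best_resolution(size, mass) / L_PLANCK
    print(f"M = {fraction:g} x M_max: Delta s ~ {ratio:.3e} l_P")
```

However heavy the device is made, the ratio printed in the last column stays above 1, and it approaches 1 only as the mass approaches the black hole limit. The device size drops out entirely, in line with the remark that the argument does not depend on how the device is built.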


Black holes are strange 


He who does not believe it owes one dollar. 
—M. Bronstein 


The reader might recognize that all these arguments, including those to be given in 
appendix 1, amount to essentially different versions of the same argument. Ultimately, 
they all come around to the fundamental fact that gravity introduces a natural energy or 
mass scale and a corresponding length scale. This type of argument goes way back, to a 
little-known paper¹⁶ published in 1935 by the brilliant Russian physicist Matvei Bronstein,
who was purged and executed at the age of 31 in 1938. 

Historically, Heisenberg and Pauli quantized the electromagnetic field in 1929, concluding
rather optimistically that “the quantization of the gravitational field . . . may be carried
out without any new difficulties by means of a formalism fully analogous to that applied
here.”¹⁷ Ha! Even quantum electrodynamics was not so easy, let alone quantum gravity. As
you probably know, this early attempt at quantum electrodynamics was afflicted by infinities
and various inconsistencies, difficulties that were not cleared up until the late 1940s
by the generation consisting of Schwinger, Feynman, Tomonaga, and others. 

But the general belief¹⁸ throughout the 1930s was that, once quantum electrodynamics
came under control, quantum gravity would follow readily, with perhaps some trivial
modifications. With deep insight, Bronstein pointed out emphatically¹⁹ that the electromagnetic
and the gravitational fields are intrinsically different, because of what was then
known as the “gravitational radius” of massive objects.

Black holes are strange, in more ways than one. The founders of quantum physics taught 
us that the quantum size of a particle of mass m is of order $\hbar/m$: the more massive the
particle, the smaller it is in the quantum world. But a black hole of mass M has size
$GM = (M/M_P)\,l_P$. The more massive the black hole, the larger it is, a behavior that is
precisely the opposite of that of all other particles, including the graviton. This peculiar fact
underlies the arguments given here. 

The presence of the Planck length $l_P$ indicates that the theory of quantum gravity,
whatever it turns out to be, cannot possibly be a quantum field theory. For one thing, 
quantum field theory is based on the notion of local observables, described by fields 
defined at points in spacetime. But with spacetime itself fluctuating wildly according 
to the “dance of the quantum,” we cannot even locate precisely where we are. In other 
words, to formulate quantum field theory, we need slices of spacelike surfaces to succeed 
one another in an orderly progression along a timelike coordinate axis. Bronstein in his 
1935 paper advocated “a radical reconstruction of the theory . . . and the rejection of [a] 
Riemannian geometry, ... and perhaps also the rejection of our ordinary concepts of 
space and time, replacing them by some much deeper and nonevident concepts.”²⁰ In
the early 21st century, string theorists are saying pretty much the same thing. Indeed, you 
can readily understand that with a fluctuating metric, fundamental concepts that we take 
for granted in doing physics, such as the arrow of time, the signature of the metric, and 
the topology of spacetime all become problematic. 


Unitarization and ultraviolet completion 


Interestingly, Fermi’s theory of the weak interaction is also characterized by a coupling 
strength $G_F$, which has dimensions, in natural units, of inverse squared mass, just like
Newton's constant G. The same dimensional reasoning that led to (1) can be applied to
the scattering of two neutrinos, say. Again, we expect something dramatic to happen at the
energy scale $G_F^{-1/2} \sim 10^2$ GeV. (Compare this with the Planck mass $\sqrt{\hbar/G} \sim 10^{19}$ GeV.)
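
The comparison of the two scales is a one-line computation. This sketch works in natural units and takes the measured Fermi constant and the conventional Planck mass as inputs of my own (neither number is quoted in the text); `breakdown_scale` is an illustrative name.

```python
import math

G_FERMI = 1.1663787e-5   # GeV^-2, measured Fermi constant (assumed input)
M_PLANCK = 1.2209e19     # GeV, so that Newton's G = 1 / M_PLANCK**2


def breakdown_scale(coupling_gev_minus2):
    """Energy at which coupling * E**2 reaches unity: 1 / sqrt(coupling)."""
    return 1.0 / math.sqrt(coupling_gev_minus2)


weak_scale = breakdown_scale(G_FERMI)
gravity_scale = breakdown_scale(1.0 / M_PLANCK**2)

print(f"weak interaction: {weak_scale:.1f} GeV")
print(f"gravity:          {gravity_scale:.3e} GeV")
print(f"ratio:            {gravity_scale / weak_scale:.1e}")
```

The weak scale lands at a few hundred GeV, within reach of accelerators, while the gravitational analog sits some seventeen orders of magnitude higher.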

But in contrast to the case of quantum gravity, we have known what that something is 
since the 1970s, not only theoretically but also experimentally. At that energy scale, the weak 
interaction becomes unified with the electromagnetic interaction into the electroweak 
interaction,²¹ and unitarity is restored. In this deeper and more complete theory, Fermi’s
coupling turns out to be $G_F \sim e^2/M_W^2$, where e denotes the electromagnetic coupling
constant and $M_W$ the mass of the intermediate vector boson responsible for the weak
interaction. A fashionable terminology is that the electroweak interaction “ultraviolet 
completes” the weak interaction. 

Unfortunately, the lesson we learned in dealing with the weak interaction does not 
appear to carry over to quantum gravity. At the moment, we do not know what the ultraviolet 
completion of quantum gravity might be. If it turns out to be string theory, the mechanism 
is to replace the graviton by a closed loop of vibrating string.²²

An interesting possibility is that the scattering of gravitons at high energies is unitarized 
by the formation of a black hole. Our understanding of black holes certainly leads us to 
expect that when an amount of energy E is dumped into a region of size $GE \sim (E/M_P)\,l_P$
(namely the Schwarzschild radius corresponding to E), a black hole will form. Explicit
calculation²³ shows that this indeed occurs. Indeed, we have already used this “fact” a
number of times in this chapter. 

Historically, when physics ventured into atomic distance scales, classical physics gave
way to quantum physics. Various classical theories were quantized. Intriguingly, it may 
be that when quantum gravity enters into Planck distance scales, quantum physics will in 
turn be replaced by classical physics somehow. G. Dvali and his collaborators have referred 
to this possibility as the classicalization of gravity.²⁴ Quantum gravity may be classicalized.
All these are just words, of course, at this point. 


Effective field theory of gravity 


As mentioned, historically, people were upset by the divergent behavior of quantum gravity 
treated perturbatively in G. One practical attitude is: “Who cares if the scattering amplitude 
goes bad at energy of the order of Mp? As long as we deal only with low energies, the 
theory works perfectly well.” More formally, this attitude is embodied in the more modern 
outlook of effective field theory, as described in chapter X.3 (and also to be mentioned in 
appendix 2). Recall that in this view, the Einstein-Hilbert term is the first in an infinite
series of terms $R + M_P^{-2}(\alpha R^2 + \beta R_{\mu\nu}R^{\mu\nu} + \gamma R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}) + \cdots$ in the action. As the
energy E in the scattering process approaches $M_P$, the higher derivative terms kick in.
But since they appear with coefficients of order $M_P^{-2}$, their effects are suppressed by
$(E/M_P)^2$ and so are entirely negligible until $E \sim M_P$. This is of course just another
way of saying that we can pretty much forget about quantum gravity in our low energy 
world. 

I think that a rough analogy might be the following. Suppose that some other civilization 
had a rudimentary understanding of quantum physics, say at the level attained circa 1910 
in our civilization, not long after Maxwell wrote down his theory of electromagnetism. The 
perturbative correction to the electromagnetic scattering between point charges (call them 
electrons) also grows with energy, although only logarithmically (like $\alpha\log(E/m_e)$, where
$\alpha \simeq 1/137$ measures the electromagnetic interaction strength). So at some point, we also
lose control over our scattering amplitude when the correction becomes of order unity,
namely when the energy approaches $E \sim e^{137}m_e$. People might wring their hands over the
“inconsistency” of quantum electrodynamics until their hands got all swollen, but as we 
know in our civilization, this difficulty was eventually resolved and revealed to be totally 
harmless. 
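
To see just how remote that energy is, here is a short sketch. The electron mass, the value of α, and the Planck mass are standard numbers that I supply; the function names are illustrative, and the logarithmic form is the schematic one used above, with all numerical coefficients dropped.

```python
import math

ALPHA = 1.0 / 137.035999    # fine structure constant at low energy (assumed input)
M_E_GEV = 0.51099895e-3     # electron mass in GeV (assumed input)
M_P_GEV = 1.2209e19         # Planck mass in GeV (assumed input)


def log_correction(energy_gev):
    """Schematic size of the perturbative correction, alpha * log(E / m_e)."""
    return ALPHA * math.log(energy_gev / M_E_GEV)


def breakdown_energy():
    """Energy at which alpha * log(E / m_e) reaches unity: m_e * exp(1 / alpha)."""
    return M_E_GEV * math.exp(1.0 / ALPHA)


e_bad = breakdown_energy()
print(f"correction reaches unity near E ~ {e_bad:.2e} GeV")
print(f"that is {e_bad / M_P_GEV:.1e} times the Planck energy")
print(f"correction at the Planck energy: {log_correction(M_P_GEV):.2f}")
```

The logarithm grows so slowly that the schematic correction is still well below unity at the Planck energy, which is one way of seeing why the "inconsistency" never troubled anyone in practice.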

Thus, more recently, the focus has been less on the divergent behavior of quantum 
gravity, but rather on the strange behavior of black holes. Yes, black holes are strange, as 
we have seen again and again. As a reminder, consider once again the bizarre behavior of 
black hole entropy S. We already noted way back in the introduction to this volume that, by 
dimensional analysis, $S \sim GM^2 = (GM)^2/G \sim A/l_P^2$. Again, the puzzling fact that it goes
like the area follows almost immediately²⁵ from the fact that G defines a natural length
scale. As discussed in chapter VII.3, there has been some progress on the question of 
where black hole entropy, which certainly does not leap out at you from the Schwarzschild 
solution, comes from. 


Quantum gravitational corrections to the Newtonian potential


It is important not to get the impression that the bad high energy behavior of gravity prevents
us from calculating anything. In fact, for measurable physical processes, it should be
possible to segregate the high energy contribution from the low energy contribution, using
an effective field theory type of approach. For instance, in quantum field theory, the Newtonian
potential $V = -Gm_1m_2/r$ between two particles of mass $m_1$ and $m_2$ results from
the exchange of a single graviton between the two particles.²⁶ If you know what Feynman
diagrams are, you can easily draw diagrams in which two gravitons are exchanged. These
have been calculated to give the result²⁷


$V(r) = -\dfrac{Gm_1m_2}{r}\left[1 + 3\,\dfrac{G(m_1 + m_2)}{c^2 r} + \dfrac{41}{10\pi}\,\dfrac{G\hbar}{c^3 r^2}\right]$ (8)


I have restored c and $\hbar$, so as to indicate which corrections go away in the limits $\hbar \to 0$
and $c \to \infty$. Note that the quantum correction is of the form $(l_P/r)^2$. If experimentalists
could measure these minuscule corrections, this would represent an eminently falsifiable
prediction of our understanding of the low energy effects of quantum gravity.
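
Just how minuscule? The sketch below evaluates the two fractional corrections in (8), assuming the coefficients 3 and 41/(10π) for the relativistic and quantum terms respectively. The constants are rounded CODATA values, and the Sun-Earth masses and separation are illustrative inputs of mine.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 kg^-1 s^-2


def correction_terms(m1_kg, m2_kg, r_m):
    """Fractional corrections to -G m1 m2 / r, assuming the coefficients in (8).

    Returns (relativistic, quantum):
      relativistic = 3 G (m1 + m2) / (c^2 r)
      quantum      = (41 / (10 pi)) G hbar / (c^3 r^2)
    """
    relativistic = 3.0 * G * (m1_kg + m2_kg) / (C**2 * r_m)
    quantum = (41.0 / (10.0 * math.pi)) * G * HBAR / (C**3 * r_m**2)
    return relativistic, quantum


# Illustrative inputs: Sun and Earth at one astronomical unit.
M_SUN, M_EARTH, AU = 1.989e30, 5.972e24, 1.496e11
rel, quant = correction_terms(M_SUN, M_EARTH, AU)
print(f"Sun-Earth: relativistic ~ {rel:.2e}, quantum ~ {quant:.2e}")

# Two one-kilogram masses a metre apart.
rel, quant = correction_terms(1.0, 1.0, 1.0)
print(f"1 kg, 1 m: relativistic ~ {rel:.2e}, quantum ~ {quant:.2e}")
```

The quantum term depends only on the separation, not on the masses, and is simply (41/10π) times the squared ratio of the Planck length to r.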


Unification with the other three interactions 


I already mentioned in passing the so-called fine structure constant $\alpha = e^2/4\pi \simeq 1/137$ introduced
by Sommerfeld²⁸ in 1916. This quantity characterizes the coupling strength of the
electromagnetic interaction and is known as the electromagnetic coupling constant, except
that we have now understood for a long time that it is not constant.* Our friend SE would
measure electromagnetic coupling strength by scattering, say, two electrons off each other.
So clearly $\alpha(E)$ is a function of the energy involved.²⁹ (This fact was not apparent before
high energy accelerators were built, because physicists had explored only a tiny range of
energies over which $\alpha(E)$ was approximately equal to $\alpha(E = 0) \simeq 1/137$.)

The coupling strengths of the strong and the weak interactions are characterized similarly. 
Hence, we have three coupling functions† $\alpha_1(E)$, $\alpha_2(E)$, and $\alpha_3(E)$ all varying 
logarithmically³⁰ with energy $E$. I had mentioned (in chapter VIII.3, chapter X.1, and 
elsewhere) that the three nongravitational interactions have been unified. One indication 
of the unification is that these three coupling functions meet at the grand unified theory energy 
scale $E_{\rm GUT} \sim 10^{16}$ GeV (as already alluded to in chapter VIII.3). In other words, although 
$\alpha_1(E)$, $\alpha_2(E)$, and $\alpha_3(E)$ are quite different in our low energy world, they become equal in 
the grand unified world. (I am necessarily painting the picture with broad brush strokes 
here, omitting all the ifs and buts.) 


* Hence the term “coupling constant” belongs with “recombination” (see chapter VIII.3) on the list of top ten 
worst physics terms. 
† They are not called $\alpha_{\rm strong}(E)$, $\alpha_{\rm weak}(E)$, and $\alpha_{\rm electromagnetic}(E)$ for reasons I do not go into here. 

So, how does gravity fit into this picture? The standard answer is that it does not. For 
one thing, gravity is exceedingly feeble compared to the other three interactions. 

But as you can see, in the context of this discussion, the bad behavior in (1) tells us 
that the coupling strength of gravity should be, in some sense, $\alpha_G(E) = GE^2$ rather than 
just $G$. If so, then, as we go up the energy scale, the gravitational coupling will shoot up 
compared to the three couplings ambling along logarithmically. Thus, all four coupling 
functions could become equal, so that the four interactions we know, love, and fear could 
become unified into one happy theory at the Planck energy $M_P$. Indeed, people have also 
speculated about the effect of the opening up of higher dimensions on $\alpha_G(E)$. 
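
The contrast between a coupling that grows like $E^2$ and couplings that amble along logarithmically is easy to see numerically. The sketch below is only an illustration: the rounded value of the Planck energy, the sample energies, and the function names are assumptions, and the electromagnetic coupling is frozen at its low energy value of 1/137 rather than allowed to run.

```python
import math

M_PLANCK_GEV = 1.22e19    # Planck energy in GeV (rounded, assumed value)
ALPHA_EM_LOW = 1.0 / 137  # low energy fine structure constant, held fixed here


def alpha_gravity(energy_gev):
    """Dimensionless gravitational coupling G E^2 = (E / M_P)^2 in natural units."""
    return (energy_gev / M_PLANCK_GEV) ** 2


def crossing_energy_gev(alpha_other=ALPHA_EM_LOW):
    """Energy at which alpha_gravity equals a given fixed coupling.

    Treating the other coupling as a constant is a simplification;
    its slow logarithmic running is ignored in this sketch.
    """
    return M_PLANCK_GEV * math.sqrt(alpha_other)


# Sample energies: 1 GeV, 1 TeV, a grand-unification-like scale, and M_P itself.
for energy in (1.0, 1.0e3, 1.0e16, M_PLANCK_GEV):
    print(f"E = {energy:9.2e} GeV   alpha_G = {alpha_gravity(energy):9.2e}")

print(f"alpha_G reaches 1/137 near E = {crossing_energy_gev():.2e} GeV")
```

Even with this crude treatment, the gravitational coupling catches up with a coupling of order 1/137 only within about an order of magnitude of the Planck energy.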

People often confound the taming of the bad high energy behavior of gravity and its 
possible unification with the other three interactions. The two issues are logically distinct. 
While it would be nice indeed to achieve both of these objectives within one elegant theory, 
we should keep in mind that it is possible to have the first without the second. 


Discord between Einstein gravity and the quantum 


In the discord between Einstein gravity and quantum physics, somehow it is gravity that 
gets blamed. Most attempts to reconcile the two have involved modifying or extending 
Einstein gravity. For example, string theory is formulated by assuming that quantum 
physics as we know it will continue to hold all the way up to the Planck energy. It is of course 
entirely possible that it is quantum physics that has to be changed. People have suggested 
the breakdown of quantum mechanics at high energies, but it is entirely possible that it 
fails in some hitherto unexplored regime. 

One thought that appeals to me is that quantum mechanics as we know it breaks down 
when the splitting between energy levels $\Delta E$ is less than the inverse of some cosmological 
time scale, such as the age of the universe. 

Certainly people have tried for a long time to change the rules of quantum physics as laid 
down around 1926, but it turns out to be extraordinarily difficult to produce a consistent 
and compelling extension that does not run into some kind of contradiction or difficulty. 
You are of course free to speculate. 

A dissenting attitude, perhaps articulated most forcefully by Freeman Dyson, is that 
gravity should not be quantized at all. I will let Dyson speak for himself. 


The essence of any theory of quantum gravity is that there exists a particle called the graviton 
which is a quantum of gravity, just like the photon which is a quantum of light. It is easy to detect 
individual photons, as Einstein showed, by observing the behavior of electrons kicked out of 
metal surfaces by light incident on the metal. The difference between photons and gravitons is 
that gravitational interactions are enormously weaker than electromagnetic interactions. If you 
try to detect individual gravitons by observing electrons kicked out of a metal surface by incident 
gravitational waves, you find that you have to wait longer than the age of the universe before 
you are likely to see a graviton. If individual gravitons cannot be observed in any conceivable 
experiment, then they have no physical reality and we might as well consider them non-existent. 
They are like the ether, the elastic solid medium which nineteenth-century physicists imagined 
filling space. Einstein built his theory of relativity without the ether, and showed that the ether 
would be unobservable if it existed. He was happy to get rid of the ether, and I feel the same 
way about gravitons. According to my hypothesis, the gravitational field described by Einstein’s 
theory of general relativity is a purely classical field without any quantum behavior. 


Note that Dyson is dismissing quantum gravity because of its weakness, but gravity 
is weak precisely because $M_P$ is so huge, as we have seen since the introduction to this 
volume. So it is basically the same attitude mentioned above: “Thanks but no thanks, we 
are already quite happy with our low energy calculation of, say, the perihelion shift of 
Mercury.” 

You, I, and everybody else—we are all free to form our own opinions. I would take 
issue with the statement “If individual gravitons cannot be observed ... we might as 
well consider them nonexistent.” Imagine uttering this sentence in the 19th century 
with the word “atoms” substituted for “gravitons.” As it turned out, the concept of atoms 
eventually did lead to a much deeper understanding of nature. Perhaps Dyson is advocating 
a utilitarian philosophy here. What does it buy us? Would quantizing gravity lead to a 
deeper understanding of nature? We will have to see, evidently. 

I discuss a bit more in appendix 3 whether gravity must be quantized. 


Appendix 1: More handwaving arguments 


Here I recount briefly an argument given by Mead.³² According to the textbook argument leading to the 
uncertainty principle, to localize a particle to within $\Delta x$, we need to use a short wavelength, high frequency 
photon with energy $E$ satisfying $\Delta x \sim \hbar/E$. This is all fine in a world without gravity, but in a world with gravity, 
the photon will exert a gravitational force on the particle, causing it to accelerate with acceleration $a \sim GE/r^2$. 
Here $r$ denotes a vaguely defined characteristic distance describing the interaction between the photon and the 
particle. (Fortunately, $r$ will drop out.) The photon traverses this interaction region in time $r$, during which the 
particle acquires a velocity $v \sim ar$ and travels a distance $d \sim vr \sim ar^2 \sim GE$. Combining this with Heisenberg’s 
uncertainty principle, we conclude that our knowledge of the position of the particle is limited by what might be 
called the generalized uncertainty principle 


$$\Delta x \gtrsim \frac{\hbar}{E} + GE \tag{9}$$


In other words, in addition to the uncertainty imposed by the wavelength of the photon, the particle we are trying 
to observe has also moved due to its gravitational interaction with the photon. Minimizing this, we see that the 
best we can do is $\Delta x \sim \sqrt{\hbar G} = l_P$. Again, notice that we have implicitly used special relativity here, equating the 
gravitational mass with energy. 
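
The minimization just mentioned takes one line. As a worked step (in units with $c = 1$, and treating the right hand side of the generalized uncertainty principle as a function of $E$ to be minimized, which is an assumption about how loosely to read the inequality):

```latex
\begin{align}
  \Delta x(E) &\simeq \frac{\hbar}{E} + G E , \\
  \frac{d\,\Delta x}{dE} &= -\frac{\hbar}{E^{2}} + G = 0
    \quad\Longrightarrow\quad E_{*} = \sqrt{\frac{\hbar}{G}} = M_P , \\
  \Delta x_{\min} &= \frac{\hbar}{E_{*}} + G E_{*}
    = \sqrt{\hbar G} + \sqrt{\hbar G} = 2\, l_P \sim l_P .
\end{align}
```

The two terms are equal at the minimum, which is reached when the probing photon carries the Planck energy; the factor of 2 is not to be trusted in an argument this rough.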

To me, this kind of handwaving argument is rather fast and loose, and should (and could) be refined. Indeed, 
Mead did refine his argument, first taking into account momentum conservation and then replacing Newtonian 
gravity by Einsteinian gravity. 

Another argument given by Mead (and already mentioned in the text) involves an attempt to confine a particle 
to a small region of size $s$. By the uncertainty principle, the energy of the particle $E \gtrsim p \sim \hbar/s$. For this region not 
to become a black hole, we require $s \gtrsim GE \sim G\hbar/s$, giving $s \gtrsim \sqrt{\hbar G} = l_P$. 

Interestingly, string theory also naturally leads to (9). Imagine using a graviton, allegedly a closed loop of 
string. As we pump energy into it, it expands to have size GE, thus accounting for the term in (9) that is added 
to Heisenberg’s term. 


Appendix 2: Failure of the perturbative expansion and effective field theory 


Suppose you refuse to let me simply assert, as in the text, that the amplitude for the elastic scattering of 
two gravitons is dimensionless. In that case, I will start with the weaker statement that $\mathcal{M} \propto G$, but now let us 
calculate the order $G^2$ correction to this amplitude: 


$$\mathcal{M} \propto G\left\{1 + \beta\, GE^2/\hbar + O(G^2)\right\} = G\left\{1 + \beta\,(E/M_P)^2 + O(G^2)\right\} \tag{10}$$


with $\beta$ some dimensionless function of the scattering angles. By definition, the correction to the 1 in the curly 
brackets is linear in $G$, and so the correction term has to go like $E^2$. 

As we crank up the energy $E$ past the Planck energy $M_P = \sqrt{\hbar/G}$, the second order term becomes larger than 
the first order term. We lose control over the perturbative expansion. Again, you recognize this as essentially the 
same argument given in the text: gravity introduces an energy scale $M_P$. 

Here are a few comments on this argument: 


1. Historically, this argument is confusingly phrased in terms of infinities. In more modern treatments of 
quantum field theory, there are no infinities in physics, only cutoffs.³³ The more sensible way to regard 
the difficulty we face is the violation of unitarity, as was explained in the text. 


2. We can readily extend this argument to cover the higher order terms. Thus, the $O(G^2)$ terms in (10) 
must have, again by dimensional analysis, the form $\gamma\,(E/M_P)^4$, with $\gamma$ some other dimensionless 
function of the scattering angles. 


3. One possible reaction to this unitarity argument could be “So what? The perturbation expansion fails.” 
It is certainly possible that one day, but a day that theoretical physicists can only dream of at this point, 
we will know how to treat quantum gravity nonperturbatively. The series in the curly brackets in (10) 
might turn out to be the expansion of a function $f((E/M_P)^2)$, which behaves with decency even for 
$E \gtrsim M_P$. It is also possible that the function is nonanalytic and does not admit a perturbative expansion. 
But these are merely words. 


4. As mentioned in the text, the modern view is to regard the series in (10) as due to some effective theory 
of the type described in chapter X.3. In quantum field theory, powers of derivatives in the action get 
converted into powers of momentum or energy in the scattering amplitude. 


Appendix 3: Induced gravity 


The perennial question of whether gravity must be quantized has a long history. Here we give an extremely 
schematic overview. First, you may know that there are three equivalent formulations for quantum physics, 
known as the Heisenberg, the Schrödinger, and the Dirac-Feynman pictures. 

In the Dirac-Feynman or path integral formulation, one integrates $e^{iS(q)/\hbar}$, where $S$ denotes the classical 
action as defined in part II of this book, over all possible paths or histories that the dynamical variable $q(t)$ can 
follow. In other words, one has to evaluate an integral of the form $\int Dq\, e^{iS(q)/\hbar}$. In the limit $\hbar \to 0$, classical 
physics is recovered by evaluating the integral in the stationary phase approximation.³⁴ 
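
For readers who have not seen the stationary phase argument, here is a schematic version, written for a single dynamical variable $q(t)$. The notation $q_c$ for the classical path and $\delta q$ for the deviation from it is introduced here for illustration only.

```latex
\begin{align}
  S(q_c + \delta q) &= S(q_c)
     + \int dt\, \left.\frac{\delta S}{\delta q(t)}\right|_{q_c} \delta q(t)
     + O(\delta q^{2}) , \\
  \left.\frac{\delta S}{\delta q(t)}\right|_{q_c} &= 0
     \qquad \text{(the classical equation of motion)} , \\
  \int Dq\, e^{iS(q)/\hbar} &\approx e^{iS(q_c)/\hbar}
     \int D\delta q\; e^{\,i\,O(\delta q^{2})/\hbar} .
\end{align}
```

As $\hbar \to 0$, the phase $S/\hbar$ oscillates wildly for paths away from $q_c$ and their contributions cancel; only the neighborhood of the path on which the action is stationary survives, and that path is precisely the one satisfying the classical equation of motion.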

The problem of quantum gravity can thus be stated as follows. Let the world be described by a set of 
“matter fields” $\psi$ and the metric $g_{\mu\nu}$. We envisage doing a giant integral of the form (we have set $\hbar$ to 1) 
$\int Dg \int D\psi\, e^{i(S_{\rm EH}(g) + S(g, \psi))}$, where $S_{\rm EH}(g)$ denotes the Einstein-Hilbert action and $S(g, \psi)$ the action for the 
fields in a spacetime described by the metric $g$. The general strategy is to do the integral in two stages, that is, to 
write it as $\int Dg\, e^{iS_{\rm EH}(g)} \int D\psi\, e^{iS(g, \psi)}$. 

At this point, quantum field theorists and people of that ilk claim that they more or less know how to do the 
integral over $\psi$, including the electron field, the quark fields, the gauge fields, and so forth. The difficulty of 
quantum gravity amounts to doing, or even defining, the integral over g. 

The proposal of induced gravity is simply this: what if we don’t integrate over g? 

Since $S(g, \psi)$ is invariant under general coordinate transformation, we are guaranteed that $\int D\psi\, e^{iS(g, \psi)}$ is 
also invariant under general coordinate transformation, and hence must have the form 


$$\int D\psi\, e^{iS(g, \psi)} = e^{\,i \int d^4x \sqrt{-g}\,\left(\Lambda + M_P^2 R + \cdots\right)} \tag{11}$$


with some mass scale $M_P$ as mandated by dimensional analysis. In other words, the integration over $\psi$ produces 
the effective field theory of gravity as described in chapter X.3. 

There is no question that integration over the matter fields $\psi$ will generate the Einstein-Hilbert term, that is, 
the scalar curvature $R$: this is merely a consequence of general symmetry considerations. The difficulty is that it 
is accompanied by $\Lambda$, which comes out naturally large, of order $M_P^4$, but this is of course just the cosmological 
constant problem biting us again. Another difficulty is that the classical equation of motion of the gravitational 
field no longer emerges automatically as Planck’s constant approaches zero, but has to be introduced by hand. 
Perhaps there is nothing wrong with this, but it sure is unattractive. 

A more extreme version of induced gravity is that we don’t even have to include $S_{\rm EH}$: Einstein gravity is 
induced by the dynamics of the matter field. An analogy would be an elastic medium, such as the vibrating 
string or membrane we talked about in part II. We can certainly write down an action for an elastic medium, 
but we know perfectly well that this action is not fundamental. It is produced or induced by microscopic physics. 
Similarly (almost blasphemous to say), perhaps gravity is not a fundamental interaction but is induced by the 
other three interactions. The fact that gravity stands apart from the other three can be regarded as supportive of 
this view. 

At one time, induced gravity appeared to offer a way out of our problems with gravity and thus enjoyed a 
following. But not quantizing gravity leads to other problems, as we will see in the next appendix. 


Appendix 4: Gravity as a classical probe 


In the Heisenberg picture, classical observables are replaced by quantum operators. In particular, the quantities 
appearing in Einstein’s field equation are to be treated as quantum operators. Thus, not quantizing gravity means 
that we continue to regard $g_{\mu\nu}$ as classical, but we treat $T_{\mu\nu}$ (which is constructed out of other fields, such as the 
electromagnetic field) as a quantum operator. C. Møller in 1962 and L. Rosenfeld in 1963 proposed the equation 


$$R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R = 8\pi G\, \langle \text{state}|\, T_{\mu\nu}\, |\text{state}\rangle \tag{12}$$


In other words, instead of a quantum operator, the right hand side is to be replaced by the expectation value of 
the quantum operator in some state. 

Once again, if we invoke the naturalness dogma, this produces a huge cosmological constant on the right 
hand side: $\langle \text{state}|\, T_{\mu\nu}\, |\text{state}\rangle = \Lambda g_{\mu\nu} + \cdots$. But let us leave that aside. The objection to this equation is that it 
violates the uncertainty principle. 

If gravity is not quantized, then it acts as a classical probe, and we could use a massive ball attached to a torsion 
balance to measure the position and momentum of a passing electron. In other words, the uncertainty principle, 
if strictly interpreted, does not allow the world to be part quantum and part classical. Conceptually, there may 
be nothing wrong with this. Let the uncertainty principle hold only in the quantum world. In his reasoning, 
Heisenberg used only quantum probes. 

However, in 1981 Page and Geilker³⁵ experimentally demonstrated the difficulty one runs into. Consider a 
Cavendish experiment in which the heavy ball is moved from one position “here” to another position “there,” 
as determined by some radioactive decay. This amounts to a Schrödinger’s cat experiment with the quantum 
state in (12) given by $|\text{state}\rangle = \frac{1}{\sqrt{2}}\left(|\text{here}\rangle + |\text{there}\rangle\right)$. The torsion pendulum would then point to a phantom ball 
situated halfway between here and there.³⁶ 
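
The phantom ball can be made concrete with a toy Newtonian calculation. The sketch below is an illustration under stated assumptions, not a model of the actual experiment: the ball is a point mass, the probe sits at the origin, the geometry and function names are hypothetical, and the cross terms $\langle \text{here}|T_{\mu\nu}|\text{there}\rangle$ are taken to vanish because the two states are macroscopically distinct, so that the expectation value in (12) is simply the average of the two classical sources.

```python
import math

G = 6.674e-11  # Newton's constant in SI units (rounded)


def field_at_origin(mass, source_xy):
    """Newtonian gravitational acceleration at the origin due to a point mass.

    The acceleration points from the origin toward the source.
    """
    x, y = source_xy
    r = math.hypot(x, y)
    return (G * mass * x / r**3, G * mass * y / r**3)


def semiclassical_field(mass, here, there):
    """Field sourced by <state|T|state> for (|here> + |there>)/sqrt(2).

    Assumes the cross terms vanish, so the source is half a ball "here"
    plus half a ball "there".
    """
    ax_here, ay_here = field_at_origin(mass, here)
    ax_there, ay_there = field_at_origin(mass, there)
    return (0.5 * (ax_here + ax_there), 0.5 * (ay_here + ay_there))


# Hypothetical geometry: the ball sits 0.1 m to the left or to the right of
# the line from the probe, at a perpendicular distance of 0.5 m.
HERE, THERE, MASS = (-0.1, 0.5), (0.1, 0.5), 1.0

print("ball definitely here: ", field_at_origin(MASS, HERE))
print("ball definitely there:", field_at_origin(MASS, THERE))
print("semiclassical source: ", semiclassical_field(MASS, HERE, THERE))
```

With a definite ball position the sideways pull has one sign or the other; with the semiclassical source it cancels, as if the ball sat in between. A real torsion balance responds to the actual position of the ball, and this is what speaks against (12) when the state is taken to be the superposition.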


Appendix 5: Quantum particles in a classical gravitational field 


This may not be the best place, but it will have to do, to mention that a series of elegant and fascinating 
experiments³⁷ were performed, starting in the 1960s, to study the behavior of quantum particles in a classical 
gravitational field, such as that of the earth. Typically, a beam of neutrons is split into two, with one sub-beam 
made to travel at a higher altitude than the other. The two sub-beams are then allowed to come together and 
interfere. Thus, theoretically, we are to study the Schrödinger equation $\left(-\frac{\hbar^2}{2m}\nabla^2 + mgz\right)\psi = E\psi$, with $z$ the 
vertical axis. In another type of experiment, neutrons are literally bounced off the floor like basketballs. 
While these experiments confirm quantum mechanics resoundingly, they do not shed any light on quantum 
gravity. 
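
For the bouncing neutrons, the vertical part of this Schrödinger equation can be solved in closed form if the floor is idealized as a perfectly reflecting wall at $z = 0$ (an assumption made here for illustration). Rescaling $z$ by the length $z_0 = (\hbar^2/2m^2g)^{1/3}$ turns the equation into Airy's equation, and the boundary condition $\psi(0) = 0$ gives $E_n = mgz_0\, a_n$, with $-a_n$ the zeros of the Airy function. The sketch below uses rounded constants and the standard tabulated values of the first three zeros; the function names are hypothetical.

```python
HBAR = 1.0546e-34       # reduced Planck constant, J s (rounded)
M_NEUTRON = 1.6749e-27  # neutron mass, kg (rounded)
G_EARTH = 9.81          # gravitational acceleration at the earth's surface, m/s^2
EV = 1.6022e-19         # one electron volt in joules (rounded)

# Ai(-a_n) = 0: first three zeros of the Airy function (standard tabulated values).
AIRY_ZEROS = (2.33811, 4.08795, 5.52056)


def length_scale():
    """Characteristic height z_0 = (hbar^2 / (2 m^2 g))^(1/3) in meters."""
    return (HBAR**2 / (2.0 * M_NEUTRON**2 * G_EARTH)) ** (1.0 / 3.0)


def energy_levels_ev():
    """Lowest energy levels E_n = m g z_0 a_n, converted to electron volts."""
    z0 = length_scale()
    return [M_NEUTRON * G_EARTH * z0 * a / EV for a in AIRY_ZEROS]


def turning_heights():
    """Classical turning heights z_n = E_n / (m g) = z_0 a_n in meters."""
    z0 = length_scale()
    return [z0 * a for a in AIRY_ZEROS]


for n, (energy, height) in enumerate(zip(energy_levels_ev(), turning_heights()), start=1):
    print(f"n = {n}:  E = {energy:.2e} eV   turning height = {height * 1e6:.1f} micrometers")
```

The levels come out in the range of $10^{-12}$ eV, with turning heights of tens of micrometers, which gives some idea of the delicacy of these experiments.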


Appendix 6: Absence of local observables in quantum gravity 


Another difficulty with quantum gravity is the absence of local observables. Heuristically, due to the possibility 
of making general coordinate transformations, we can move spacetime points around. People sometimes say, 
rather sloppily, that in an observable O(x), the x cannot be specified. I will try to be more specific. 

To read this appendix, you need to know that in quantum physics, observables are represented by operators 
$O(x)$ in the Heisenberg picture mentioned in appendix 3. Our task is to calculate the functions³⁸ $G(x_1, \cdots, x_n) = 
\langle O(x_1) \cdots O(x_n) \rangle$ defined as the expectation value of a product of operators in the vacuum state. In the Dirac-
Feynman or path integral formalism, we do not speak of operators, but instead of functional integrals, as was 
also mentioned in appendix 3. The expectation value $\langle O(x_1) \cdots O(x_n) \rangle$ is then represented by an integral, so that 
$G(x_1, \cdots, x_n) = \int Dg\, D\phi \cdots e^{iS(g, \phi, \cdots)}\, O(x_1) \cdots O(x_n)$. Physically measurable quantities, such as scattering 
cross sections, are determined by the functions $G(x_1, \cdots, x_n)$. Dear reader, if you have had no exposure to 
quantum physics, and if this paragraph is complete gibberish to you, then you should certainly skip this appendix. 

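After this brief preliminary, let $O(x)$ be a scalar observable; in other words, if we make a general coordinate 
transformation, then we have $O'(x') = O(x)$. Specifically, if $O(x)$ is constructed out of $g_{\mu\nu}(x)$, a scalar field $\phi(x)$, 
and so forth, then $O'(x')$ is constructed out of $g'_{\mu\nu}(x')$, $\phi'(x') = \phi(x)$, and the like. Then we have 


$$
\begin{aligned}
G(x_1, \cdots, x_n) &= \langle O(x_1) \cdots O(x_n) \rangle \\
&= \int Dg\, D\phi \cdots e^{iS(g, \phi, \cdots)}\, O(x_1) \cdots O(x_n) \\
&= \int Dg\, D\phi \cdots e^{iS(g, \phi, \cdots)}\, O'(x'_1) \cdots O'(x'_n) \\
&= \int Dg'\, D\phi' \cdots e^{iS(g', \phi', \cdots)}\, O'(x'_1) \cdots O'(x'_n) \\
&= \int Dg\, D\phi \cdots e^{iS(g, \phi, \cdots)}\, O(x'_1) \cdots O(x'_n) \\
&= G(x'_1, \cdots, x'_n)
\end{aligned}
\tag{13}
$$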


Note that the first equality (not counting the definitions) is just $O'(x') = O(x)$. The second equality follows from 
the invariance of the action $S(g', \phi', \cdots) = S(g, \phi, \cdots)$ and of the integration measure $Dg'\, D\phi' = Dg\, D\phi$ under 
general coordinate transformation. The third, and crucial, equality is due to the elementary calculus theorem 
stating that under integrals, we can rename the dummy integration variables at will. Note, however, that we do 
not erase* the primes on $x'_1, \cdots, x'_n$. The equality $G(x_1, \cdots, x_n) = G(x'_1, \cdots, x'_n)$ implies³⁹ that $G(x_1, \cdots, x_n)$ 
does not depend on its arguments and thus can only be a constant. Thus, quantum gravity cannot be based on 
local observables, but instead has to be built out of nonlocal observables. This statement provides one of the 
starting points of the approach to quantum gravity known as loop quantum gravity. 

The argument depends essentially on the fact that quantum physics involves an integration over all configurations. 


* Compare and contrast with the discussion in appendix 1 to chapter VI.4. 


Notes 


1. I try hard to avoid the use of words like “obviously” in my textbooks, but surely the reader would agree that 
my usage of the dreaded word here is justified. 

2. For those readers who want to get into the subject, a recommended starting point is A. Strominger, “Five 
Problems in Quantum Gravity,” arXiv:0906.1313 (2009). See also Approaches to Quantum Gravity: Toward a 
New Understanding of Space, Time and Matter, ed. D. Oriti (Cambridge University Press, 2009) for a wide 
variety of viewpoints. 


3. Not to mention my assumption that the typical reader of this book has only a limited knowledge of quantum 
physics. 

4. For a list, see, for example, p. 5 in C. Rovelli, Quantum Gravity, Cambridge University Press, 2007. 

5. In one sentence: spacetime is discretized and the distances between lattice points are dynamical. To get into 
the literature, look at review articles by R. Loll and others. 

6. The word is appropriate for a book on gravity: from a-sym-ptotos = falling together. See the lament of Qfwfq 
in chapter IX.3. The reference to falling persists in the medical term ptosis, a drooping of the upper eyelid. 

7. In one sentence: as advocated by S. Weinberg (http://arXiv.org/abs/arXiv:0908.1964), quantum gravity may 
be governed by an attractive ultraviolet fixed point at some finite value of the coupling. To me, it is an 
attractive idea, but unfortunately, to explain it properly, I would have to assume a great deal of knowledge 
about quantum field theory, particularly the notion of the renormalization group. To get into the literature, 
look at review articles by M. Reuter and others. 

8. But Lightfoot, as in Gordon Lightfoot, is not a unit of time, since foot is not necessarily a unit of length. 

9. See any textbook on quantum mechanics and quantum field theory, for example, QFT Nut, pp. 139ff. 

10. QFT Nut, p. 172. 

11. In 1929, a year after the death of his mother, who thought that her youngest son would never amount to 
anything. Born in 1892 and having died in 1987, he was one of the longest-lived theoretical physicists. 

12. As is appropriate for a textbook, I am presenting the standard mainstream view here. The statement 
that we cannot go smaller than $l_P$ is far from settled and has a controversial literature. For a small sampling, 
see H. Salecker and E. P. Wigner, Phys. Rev. 109 (1958), pp. 571-577; R. Gambini and R. Porto, 
http://arXiv.org/pdf/gr-qc/0603090.pdf, sec. II; Y. J. Ng, Ann. N.Y. Acad. Sci. 755 (1995), pp. 579-584; 
R. Gambini, J. Pullin, and R. Porto, http://arXiv.org/abs/hep-th/0406260; and G. Amelino-Camelia and L. Doplicher, 
Class. Quant. Grav. 21 (2004), pp. 4927-4940, hep-th/0312313. 

13. For example, J. J. Sakurai and J. Napolitano, Modern Quantum Mechanics, p. 34. 

14. She helped us crucially in understanding renormalization in QFT Nut, chapter III, and has already appeared 
in this book on several occasions. 

15. I adapted this argument from X. Calmet, M. Graesser, and S. D. H. Hsu, Phys. Rev. Lett. 93 (2004), to which 
I refer the reader for caveats and further details. 

16. See G. Gorelik, Physics Uspekhi 48 (2005), p. 1039. 

17. Translated from W. Heisenberg and W. Pauli, “Zur Quantendynamik der Wellenfelder,” Zeit. Physik 56 (1929), 
p. 3 [in German]. 

18. Keep in mind the enormous confusion at the time, such as Bohr’s proposal that energy is not conserved, 
and the issue of whether the uncertainty principle could be applied to fields. L. Rosenfeld was 
apparently the first to show that quantum field theory and classical relativity are not consistent together: 
http://www.sciencedirect.com/science/article/pii/0029558263902797. 

19. He ended his paper with the statement “Wer’s nicht glaubt, bezahlt einen Thaler.” [He who does not believe 
it owes one dollar.] Compare the stories of J. Grimm and W. Grimm. 

20. Translated from M. Bronstein, “Quantization of Gravitational Waves,” J. Expt. Theor. Phys. 6 (1936), p. 195 
[in Russian]. 

21. See any number of textbooks on particle physics for an explanation. For a concise summary, see QFT Nut, 
chapter VII.2. 

22. See, for example, J. Polchinski, String Theory. 

23. By D. Eardley and S. Giddings, and by S. Hsu. For one speculation on what high energy scattering of gravitons 
may look like, see S. Giddings and R. Porto, http://arXiv.org/abs/0908.0004. 

24. You may wish to look at some of G. Dvali’s papers in the physics archive. 

25. We need to argue from the expectation that $S \to 0$ as $G \to 0$ that $S$ is linear in $G$. See the introduction to 
this book. 

26. See, for example, QFT Nut, chapter I.5. 

27. N. E. Bjerrum-Bohr, J. F. Donoghue, and B. R. Holstein, arXiv:hep-th/0211072. The earlier literature can be 
traced from this paper. In particular, the philosophy of treating general relativity as an effective field theory 
was outlined by J. F. Donoghue, arXiv:gr-qc/9405057. 

28. A story about $\hbar$: I once stayed at a physics institute in Munich, where a commemorative metal plaque 
inscribed with Sommerfeld’s formula was set in the lobby. Some friends who were not physicists came 
to visit, and I asked them what the funny symbol $\hbar$ meant. The craftsman carved the plaque in such a way 
that the bar in “h bar” was a short horizontal line, which crossed the long vertical line that forms the spine 
of the letter h. A German woman, evidently an antinuclear and peace activist, immediately responded that 
physicists were contrite about inventing nuclear reactors: the “rounded arch” in h represented a nuclear 
reactor, right next to which was erected a Christian cross memorializing all the people physicists had killed 
indirectly. Very creative deconstruction of $\hbar$! 

29. See, for example, QFT Nut, p. 164. 

30. This is related to the assertion in the preceding section that the perturbative correction to electromagnetic 
scattering grows logarithmically with energy. 

31. F. Dyson, “The World on a String,” New York Review of Books, May 13, 2004. Copyright © 2004 by Freeman 
Dyson. 

32. C. Alden Mead, Phys. Rev. 135 (1964), p. B849. 

33. This point is explained here. See, for example, QFT Nut, chapters III.1 and III.2. 

34. This description is of course way too schematic for anyone not already familiar with the subject. For a brief 
introduction, see QFT Nut, chapter I.2. 

35. D. N. Page and C. D. Geilker, “Experimental Evidence for Quantum Gravity,” Phys. Rev. Lett. 47 (1981), p. 979. 

36. I don’t know enough about quantum measurement theory to decide whether I should worry about this. 

37. For a detailed review, see D. M. Greenberger and A. W. Overhauser, Rev. Mod. Phys. 51 (1979), p. 43. See also 
the textbook on quantum mechanics by J. J. Sakurai and J. Napolitano. 

38. Known as some form of Green’s function. But I don’t want to confuse some readers even further with 
unnecessary names. 

39. The crucial point is that general coordinate transformations form a very large set of transformations indeed. 
If we only have invariance under, say, translation, namely $x'_i = x_i + a$, then we can only conclude that 
$G(x_1, \cdots, x_n)$ is a function of $x_1 - x_n, x_2 - x_n, \cdots, x_{n-1} - x_n$, as usual. Similarly for invariance under 
Lorentz transformation. 

40. R. Gambini and J. Pullin, A First Course in Loop Quantum Gravity, Oxford University Press, 2011. 


Recap to Part X 


As I warned you in the preface, part X contains more speculative topics, including some 
that may not be of lasting value. 

I like the Kaluza-Klein idea so much that I will be disappointed if Nature does not make 
use of it somehow. I feel the same way about twistors, but less strongly. The treatment 
of finite sized objects, the effective field theory approach to physics, and topological field 
theory are all topics that in all likelihood will last. 

In contrast, the chapters on the cosmological constant paradox and on quantum gravity 
are wildly speculative and, some might say, do not belong in a textbook. But I disagree: 
textbooks should not consist exclusively of material that has been carved in stone, or even 
worse, embalmed. 


Closing Words 


I admire Einstein’s theory of gravity as a work of art. 
—Max Born 


In his last years, as I knew him, Einstein was a twentieth 
century Ecclesiastes, saying with unrelenting and indomitable 
cheerfulness, “Vanity of vanities, all is vanity.” 
—Freeman Dyson¹ 


Here I collect some closing words,² a few random thoughts that constitute neither a 
conclusion nor a summary, just a bit of purple prose. 

Theoretical physicists have been bowled over, not only by the aesthetic appeal and 
observational successes of Einstein gravity, but also by its profound impact. As we have 
seen, Einstein’s theory is characterized by four fabulous features: 


1. its mathematics is strikingly elegant; 
2. its input consists of one single long-established fact that would otherwise be deeply puzzling; 
3. its predictions were immediately verified; and 


4. it has profound things to say about our understanding of the world, the very nature of 


spacetime. 


As mentioned in chapter V.2, my enthusiasm is based not merely on the impact of 
Einstein gravity on physics, but also on the impact of Einstein gravity on how we do 
theoretical physics. With its great success, Einstein gravity has in time become a model 
for theoretical physics. It remains to be seen how fruitful this approach will prove to be, 
but there is no denying its appeal for theoretical physicists. Latch onto a well-established 
but not understood physical fact, start with an attractive mathematical framework, get the 
whole enchilada in one fell swoop, and enjoy a dramatic, almost immediate, confirmation. 
When this approach works, as it did for Einstein, it’s fabulous, no question. 

We may call this the Einstein mode of theoretical physics, exemplified later by Dirac, 
for example, albeit at a somewhat lower level. Theoretical physicists have strived for this 
ideal³ ever since, but thus far, always stumbling on one or the other of these four features. 


The most beautiful experience we can have is the mysterious. It 
is the fundamental emotion that stands at the cradle of true art 
and true science. Whoever does not know it and can no longer 
wonder, no longer marvel, is as good as dead. 


—A. Einstein 


Einstein revealed to us two mysteries, the mystery of gravity and the mystery of the cosmos. 
The two are logically separate, but we feel vaguely that they are somehow intimately linked. 
At our present level of understanding, while the universe certainly needs gravity, gravity 
appears to be indifferent to the universe: gravity operates within the universe, growing 
structures and making it expand. All essential tasks as far as the universe is concerned, but 
even if the universe consists of only the sun and the earth and nothing else, the Einstein- 
Hilbert action could still work its magic. We don’t even know what a linkage between the 
two mysteries might look like, if there is one. The Dirac large number hypothesis—that 
fundamental constants could conceivably age with the universe—may offer a primitive 
example of this. To me, a decaying cosmological constant might be an attractive resolution 
of the cosmological constant paradox. 

The cosmological constant paradox may or may not be the key to a deeper understanding 
of gravity, but let us hope that the dark energy is not due to a random bunch of scalar fields.⁴ 
Einstein said, “Physics should be as simple as possible, but not any simpler.” To this, we 
say, “The solution to the cosmological constant paradox should be as crazy as possible, but 
not any crazier.” 

Is our present understanding of cosmology too simple? With a first order ordinary
differential equation and the liberal use of the Gamow principle, we have conquered the
universe! It certainly cannot be denied that we have achieved an astonishingly
quantitative understanding of cosmological data.⁵ In a way, it is cause for celebration. Physics
triumphs. But if there is no more cosmic mystery, might that not cause a sense of bitter 
disappointment, even among the most rationalist physicists? Is that all there is to it? 

Perhaps our present cosmology has already been made too simple for what Einstein 
would have liked: is it simpler than “as simple as possible”? 

So I am glad, and I suppose many others are also, to feel that, in spite of the fabulous 
success of the standard model of cosmology,⁶ a sense of unease remains. I suspect that
many, deep down, are elated by the coincidence problem.⁷ In the unfathomable and
unceasing parade of the eons, why⁸ now?

There is something seriously incomplete in our understanding. Both dark matter and 
dark energy were largely unexpected and are an embarrassment for particle physics, but it 
is by and large the particle physicists who think that almost everything is understood, not
the astronomers and the cosmologists. 

Einstein spoke of the asylum more than once. We have grown to be far more self- 
confident and conceited than he. 

Wigner once wrote about the unreasonable effectiveness of mathematics in physics (see 
also chapter VII.3). Here we could speak about the unreasonable effectiveness of physics 
in understanding the universe. A first order ordinary differential equation suffices! Of 
course, we understand this as a consequence of the perfect cosmological principle,⁹ but
still. 

Is the universe we understand not the whole universe, but the universe filtered through 
the human mind? The distinguished cosmologist E. R. Harrison, in his final book Masks 
of the Universe, suggests that our current cosmology, with its dark and light sectors, is
yet another mask obscuring the true universe. My impression is that most in the physics 
community are inclined to dismiss Harrison, but I sympathize with his mystical views to 
some extent. 

The universe may have its own mysteries that gravity knows not. 


Wheeler once argued that the universe could only be closed. Open and flat universes 
troubled him, Wheeler said, because the infinite space implied that everything that could 
have happened would have happened. Another version of you would have read not only 
this book, but also this book in all its many drafts. The same argument could have been 
invoked to rule out the flat earth with its edge infinitely far away. As we go on an endless 
quest to the edge, we would have encountered every imaginable goblin and monstrosity. 
Just as some films are not suitable for young minds, infinity is not suitable for the human 
mind. In theoretical physics, we must have cutoffs. 

With an accelerating expansion, however, the universe separates into isolated regions 
that cannot communicate with each other. So, Wheeler’s concern might be mollified. 

We are often awed by the power of aesthetic or “philosophic” arguments, but on the 
other hand, we should not succumb to selective bias and fail to remember the ones that 
failed. But has Wheeler’s argument really failed? No. The universe has not been proven to 
be flat.¹⁰

The underlying algebra of physics was extended from the Galilean algebra to the Lorentz 
algebra. As the flat earth once gave way to the round world, will the Lorentz algebra be
replaced by the de Sitter algebra?¹¹


As for the mystery of gravity, let us not forget that we actually have a pretty good
understanding of gravity; in particular, the connection with spacetime curvature is nothing short
of astonishing. We have merely gotten used to it. But by its very nature as a quest,
theoretical physics always wants more. Mainly, theorists would like gravity (1) to be quantized,
and (2) to be unified with the other interactions. 

The internal consistency of physics appears to demand the quantization of gravity. Some 
simply cannot tolerate seeing the world split up into two pieces, a quantum world and a
classical world. The best argument I have heard is that the uncertainty principle would be
violated if gravity is not quantized: gravity could then be used as a classical probe.¹² But
dissenters abound.¹³ One line of thought is that gravity may not be fundamental; another,
similar, line of thought is that gravity may be induced. Then there is the Dysonian view
that the quantization of gravity is inconsequential¹⁴ for physics.

We talked about quantum gravity in chapter X.8—a mishmash of thoughts about quan- 
tum gravity. An intriguing possibility is that when we get into the Planckian domain, we 
will have to classicalize physics, in contrast to that previous occasion when we got into the 
atomic domain and had to quantize physics. 


While the quantization of gravity may be required, the unification with the other three 
interactions does not appear to be.¹⁵ Gravity stands apart from the other three interactions.
As Einstein said, the gravitational field is first among equals. While the other three 
interactions operate within spacetime, Einstein gravity is spacetime. For me (and of course 
also for many other theoretical physicists), the most puzzling aspect of Einstein gravity is 
its ability to alter the causal structure of spacetime completely. When we focus on gravity as 
a small perturbation on Minkowski spacetime, as in chapter IX.4, then it behaves just like 
the other three interactions. When quantized, gravitational waves give rise to the graviton, 
which conceptually does not differ vastly from the photon. But when gravity is allowed to 
curve spacetime globally, all manner of strange goings-on torment theoretical physicists. 
Is unification a prerequisite for understanding gravity? We don’t know. Historically, 
neither electricity nor magnetism could be fully understood until they were unified. 


Perhaps a greater mystery than either the mystery of gravity or the mystery of the universe 
is the mystery of the quantum. We know how to calculate but not how to interpret. 
We teach and learn the Copenhagen interpretation, but many prefer the many worlds 
interpretation. Yet, to many, the concept of the many many, very many, worlds somehow 
exudes a sleaziness that cannot be expressed in words.¹⁶

Quantum field theory in curved spacetime is a well-developed subject and leads to 
Hawking radiation, for example, but again I have lingering doubts. In calculating a loop 
diagram for some quantity, say the electron’s magnetic moment at the horizon, are there 
subtleties involving virtual particles propagating inside the horizon and then out again? 
Presumably it is okay over a distance scale on the order of the Compton wavelength of the 
particles involved. 

In the path integral formalism, to calculate the propagator of a particle from point A to 
point B, we are instructed to sum over all classical paths going from A to B. Suppose A 
and B are both near but outside the horizon of a black hole. Do we sum over the paths in
which the particle goes inside the horizon and then out again? These paths are forbidden 
by classical physics. 

Here is a toy problem. Consider the quantum mechanics problem of a particle near a 
sharp cliff, that is, a potential of the form $V(x) = 0$ if $x > 0$ and $V(x) = -\infty$ if $x < 0$. We all
know how to do this using the Schrödinger formalism, but in the path integral formalism,
do we sum over the classical paths in which the particle falls off and can’t come back (or, 
depending on the initial and final condition, paths in which the particle stays down and 
doesn’t know about life in the upper crust)? One approach would be to regularize, that is, 
to round off the sharp edge of the cliff. Admittedly, this problem does not have the causal 
richness of black hole physics. 

More generally, in doing a quantum gravity path integral sum over all gravitational field 
configurations, are we to include configurations containing black holes or not? The glib 
response is of course, in the same way that when we do the path integral sum for a quantum 
field theory, we are to include the solitons, if any, in the spectrum. But the internal world 
of a black hole is outside “knowable” physics.¹⁷

Many frontier questions involve quantum field theory. Is quantum field theory solid? 
We should think so, at least in the infrared. But yet, as explained in chapter X.8, quantum 
field theory as we understand it does not appear to accord with observation, even if we trust 
it up to merely the electron mass scale, let alone the Planck or string scale. Students often 
think that quantum field theory is a closed subject, on which (too?) many textbooks have 
been written. But if history is a guide, there ought to be a wealth of phenomena in quantum
field theory yet to be dreamed of, just as there is a wealth of phenomena in quantum field 
theory we now know of that were quite unknown in the 1950s and 1960s. There is the 
additional mystery in the general belief that quantum gravity cannot be a local quantum 
field theory, since quantum gravity does not have any local observables.¹⁸

Is it possible for Planck’s constant $\hbar(M)$ to depend on the relevant mass or energy
scale $M$? Does it make sense to raise this possibility? Why not? After all, $\hbar$ is the
parameter that controls the proximity of classical and quantum physics. At one time, anybody
suggesting that the fine structure “constant” $\alpha(M)$ was not constant might also have been
accused of being crazy (but not crazier). 

It is tempting to blame the woes of quantum gravity on quantum mechanics, as al- 
ready said in chapter X.8. The blame game is certainly inappropriate in many human 
situations; it may also be inappropriate in theoretical physics. Consider the ultraviolet 
catastrophe as an example. Around the start of the 20th century, would you have blamed 
electromagnetism or statistical mechanics?¹⁹ Place your bet! As it turned out, neither was
to be blamed. A novel kind of physics had to arrive on the scene. Similarly, perhaps the cos- 
mological constant paradox and the difficulties of quantum gravity are to be blamed neither 
on quantum mechanics nor on gravity, but are crying out for a novel kind of physics. 


The existence of spin $\frac{1}{2}$ fermions is no doubt another deep mystery of physics.²⁰ The
existence of the representation $(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ of the Lorentz group allows the existence of
something that becomes the negative of itself upon a $2\pi$ rotation, but only in a quantum
world! Suppose we didn’t know about the electron. Could we have imagined that Nature
would fill this representation? This suggests another one of our extragalactic fables: in
some alternative history of physics, the existence of a spin $\frac{1}{2}$ particle could have been
another famous prediction of special relativity, like $E = mc^2$.

If we didn’t know about the electron, would we have known that there is another 
formulation of Einstein gravity using the vielbein instead of the metric? (The answer is 
yes, because Cartan did invent the vielbein without invoking the electron.) 

The fermion puzzles me. Sometimes I feel that the world ought to contain only bose
fields. Perhaps half integral spins are emergent.²¹


Traditionally, particle physicists have focused on the bad high energy behavior of grav- 
ity, but that may not be the real issue, as mentioned in chapter X.8. Furthermore, work 
using the twistor formalism sketched in chapter X.6 indicates that superficially more com- 
plicated theories, like Einstein gravity and Yang-Mills theory, may have better ultraviolet 
behavior than a simple scalar field theory. People have discovered amazing cancellations
and tantalizing simplifications²² in calculating amplitudes for the scattering of gravitons.
One particularly intriguing hint is that amplitudes in Einstein gravity can be regarded as
the square, or sum of squares, of amplitudes²³ in Yang-Mills theory. Not only is Einstein
gravity deeply geometrical, but also the gauge theories that underlie high energy physics 
may be geometrical, at least much more so than we have appreciated thus far. 

This book adores the action. But, as was made quite stark in the twistor chapter, 
dramatic simplification occurs when, and only when, we restrict the 4-momenta in the 
scattering amplitude to be lightlike, that is, to be on-shell. The action carries a lot of
off-shell information; in other words, using the action, we can calculate quantum amplitudes
$A(p_1, p_2, \ldots, p_n)$, with $p_i^2$ taking on arbitrary values that are not necessarily 0. The
Einstein-Hilbert action certainly does not look like the square of anything. A lot of relevant
physics might be hidden inside the action.²⁴


As a low energy effective theory, Einstein gravity is rather rigid, which is both good and bad.
In the effective field theory approach, there is little room on either side of the Einstein- 
Hilbert action: the higher dimensional terms are relevant only at short distances, while 
the only lower dimensional term is the mysterious cosmological constant. No room to fool 
around in. To deal with the cosmological constant paradox, we may be compelled to add 
nonlocal terms, and they can readily be designed to account for observations. Alternatively, 
we could abandon Lorentz invariance, thus opening up the gap²⁵ between the Einstein-
Hilbert and the cosmological constant terms.

Theoretical physics as we now know it rests on many pillars: the quantum principle,
the action principle, locality, causality, Lorentz invariance, general coordinate invariance, 
the gauge principle, and perhaps a few other odds and ends. When people make a list 
like this, it is hard to see why certain things are included and others not. These concepts 
or principles intertwine and are mutually dependent, to some extent. For example, the 
action as usually formulated involves a local Lagrangian and has locality and causality built 
in, and if by the quantum principle, we mean the path integral formalism, the quantum 
principle is dependent on the action. And of course general coordinate invariance and the 
gauge principle are not principles at all but merely a statement that $g_{\mu\nu}$ and $A_\mu$ contain
“nonphysical” degrees of freedom. So, in the future, if one of these pillars were to crack and 
fall, which one would it be? Perhaps there are already some tantalizing hints that locality 
might fail. (Some might even argue that string theory is not a local theory but predicts 
locality at low energies.) Indeed, it is causality that we want to preserve, not locality. 

I am particularly respectful of, perhaps even awed by, the action principle. It is truly 
amazing that, while many phenomenological theories cannot be derived from an action, 
all the fundamental interactions we know—gravity, strong, weak, and electromagnetic— 
can be. A priori, there is no transparent reason why all the foundational equations we know 
can be written as the extrema of something. 

Our friends the observational cosmologists have given us much comfort regarding 
whether physics operates the same way throughout the universe. Nevertheless, many 
theoretical physicists might not be bent too much out of shape if some of their cherished 
concepts fail on the cosmological scale, as mentioned in chapters X.7 and X.8. The universe 
may be secretly acausal, but only the universe knows about it. 


In special relativity, young Einstein was able to accomplish what Lorentz and Poincaré 
were not able to do, even though the two established giants had most of it worked out, at 
least mathematically. After all, Lorentz had the Lorentz transformation in all its glory. The 
two older physicists were not able to abandon the perfectly sensible notion that if there 
is a wave, something must be waving. (Incidentally, Maxwell believed in the ether, even 
though his equations did not need it.²⁶) So they had the ether as a dynamical variable.
Einstein simply trashed the ether and asserted that nothing could also wave.²⁷

Nowadays, any student is able to accept, without blinking twice, that an electromagnetic 
wave consists of $A_\mu$ waving—yes, just a mathematical symbol $A_\mu$ known as a field waving.
Of course, there are energy and momentum densities associated with the wave, and so it 
is real in that sense. 

But what is a field? After spending years writing a textbook on quantum field theory, I 
understand a field as something that does what a field does. No more, no less. A recent 
textbook²⁸ on electromagnetism asserts that the electromagnetic field is as real as a rhino.
My response is that a quantum field is as real as a quantum rhino.

To move forward, physics had to abandon an apparently ironclad piece of common sense:
where there is a wave, something must be waving. I would not be at all surprised if it turns 
out that to move forward, we have to abandon an equally ironclad piece of common sense. 

Another reason that the old guards such as Lorentz and Poincaré failed while Einstein 
succeeded was that they tried to derive special relativity from some dynamical theory of 
the electron now mercifully forgotten. The establishment's first reaction to Einstein’s work 
was that the young fellow merely imposed the answer, which he had not derived in any way. 

Einstein curved spacetime. Perhaps the next step is to endow spacetime with substance, 
so that spacetime in neighboring regions can push against one another.²⁹


Many have made careers out of worrying about quantum gravity. But classical gravity is 
already plenty puzzling. When we first studied physics, we were told that physics should be 
local, that something happening here can only affect something happening nearby, and for 
a physical effect to propagate across spacetime, a field is needed.³⁰ But the horizon around
a black hole is a strikingly nonlocal concept. Nothing happens locally. Observers falling in 
do not notice anything. The puzzle is that the Riemann curvature is nice and smooth at 
the horizon and can be made arbitrarily small for massive black holes. But somehow the 
other fields in the world know about the metric $g_{\mu\nu}$ directly, not about Riemann curvature.

The horizon is an inherently nonlocal concept.³¹ By drawing a Penrose diagram, we can
see that we could be sitting peacefully while an incoming shell of matter far away threatens 
to form a black hole soon, and we could be inside the horizon even before the black hole 
forms. 

Can we possibly modify general relativity so as to avoid having a horizon? Once again, 
apparently not, because a black hole is a low energy phenomenon. Naively, we might also 
think that the addition of local terms would not remove a nonlocal phenomenon like a 
horizon. But perhaps one should still try—it is certainly conceivable, at least to me, that 
the naive view is wrong. 


The founders of quantum field theory wrote profound equations such as $A_\mu = 0 + A_\mu$
and $\varphi = 0 + \varphi$. Fields execute quantum fluctuation around vanishing classical values.
But then physicists became more sophisticated in the 1960s and wrote fancier equations
like $\varphi = v + \varphi'$, with $v = \langle \varphi \rangle$. The basic equation for the graviton field has the same
form: $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$. This naturally suggests that $\eta_{\mu\nu} = \langle g_{\mu\nu} \rangle$ and perhaps some sort
of spontaneous symmetry breaking. But gravity exhibits a fundamentally new feature:
$g_{\mu\nu}$ is a matrix and hence has a signature. Large fluctuations of $h_{\mu\nu}$ can change the
signature of $g_{\mu\nu}$, and there could be regions with two times. An obvious thing to write
down would be a potential for $g_{\mu\nu}$ (which breaks general coordinate invariance) of the
form $V(g) = \Lambda^4 (g_{\mu\nu} - \eta_{\mu\nu})^2$, or more generally, a potential with a deep well pinning $g_{\mu\nu}$ to
values close to $\eta_{\mu\nu}$. This induces a graviton mass of order $m_g \sim \Lambda^2/M_{\rm P}$, so that $\Lambda^2$ is given
by the product of the largest mass and possibly the smallest mass known to physics.
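
The dimensional counting behind this estimate can be sketched roughly as follows. This is only an illustration under stated assumptions: units with $\hbar = c = 1$, a potential whose overall coefficient is $\Lambda^4$ with $\Lambda$ a mass scale, $M_{\rm P}$ standing for the Planck mass, and all index structure and factors of order 1 dropped.

```latex
% Sketch only. Assumptions: hbar = c = 1; order-one factors and index structure dropped;
% \Lambda is a mass scale multiplying the potential; M_P is the Planck mass.
\begin{align*}
S &\sim \int d^4x \left[ M_{\rm P}^2\, (\partial h)^2 - \Lambda^4 h^2 \right],
  \qquad g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \\
h &= \frac{h_c}{M_{\rm P}} \;\Rightarrow\;
S \sim \int d^4x \left[ (\partial h_c)^2 - \frac{\Lambda^4}{M_{\rm P}^2}\, h_c^2 \right], \\
m_g^2 &\sim \frac{\Lambda^4}{M_{\rm P}^2}
\;\Rightarrow\; m_g \sim \frac{\Lambda^2}{M_{\rm P}}
\;\Rightarrow\; \Lambda^2 \sim m_g\, M_{\rm P}.
\end{align*}
```

Read this way, $\Lambda$ sits at the geometric mean of the graviton mass and the Planck mass.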

This line of thought raises the possibility that the potential V(g) might have minima 
elsewhere. Perhaps there is a phase with $g_{\mu\nu} = 0$. That could be the ultimate terrorist plot:
to unleash a $g_{\mu\nu} = 0$ bomb that would annihilate spacetime in the victimized country.
Compared to this catastrophic transition, heading back into the Big Bang merely causes
$g_{ij}$ to vanish; the Bang created space but not time. Could the universe have begun at a
singularity “where” $g_{00} = 0$ as well as $g_{ij} = 0$? Not only no space, but also no time. This
is not as wild as it may sound; indeed, as mentioned in an endnote in chapter IX.10, 
both Einstein and de Sitter contemplated different versions of this for use as boundary 
conditions. 


In the beginning, the strong interaction was written in terms of nucleons and pions. 
Decades passed before the correct dynamical variables were discovered. Writing the action 
in terms of quarks and gluons rather than nucleons and pions turned out to be the crucial 
step in understanding the strong interaction. It is conceivable that a similar step has to be 
taken for gravity. At the simplest level, we have already seen that the action can be written 
in terms of either the metric or the vielbein. But a more drastic step may be needed, 
and the discovery that the scattering amplitudes for gravitons are equal to the square 
of the scattering amplitudes for gluons may offer a hint. Perhaps the correct dynamical 
variables³² have yet to be found.


Could Einstein gravity be replaced by something more fundamental, which could lead to
$\sqrt{-g}\,R$ effectively at low energy, much as quantum chromodynamics leads to the Yukawa
pion-nucleon theory? Suppose particle physics experiments had stopped in the mid-1950s. 
Could we have leapt from Yukawa theory to quantum chromodynamics? It is conceivable 
that, by thinking about the proton decay paradox, we could have. This may turn out to shed 
some light on the cosmological constant paradox.³³

The question, stated in the format of an IQ test question, is then “What is to gravity as 
quantum chromodynamics is to pion-nucleon theory?” 

I am not necessarily suggesting here that the graviton is composite. Indeed, a theorem 
by Weinberg and Witten states, with rather general assumptions, that the graviton cannot 
be a bound state. By the way, the AdS/CFT correspondence exemplifies a way around this 
theorem. The gauge theory at the boundary of anti de Sitter spacetime could produce a
graviton, but only by growing a spatial direction at the same time.³⁴

Even if this theorem could be somehow evaded, the possible compositeness of the
graviton appears to be irrelevant to the cosmological constant paradox. Let the graviton’s
compositeness be characterized by $a^*$. In quantum field theory, this would be revealed in
the graviton’s propagator at a characteristic momentum of $q^* \sim 1/a^*$. But in the
cosmological constant paradox, the relevant momentum carried by the graviton is in what I called
the extreme ultra infrared, with $q_{\rm cosmological} \sim 1/L_{\rm cosmological} \sim 0$, where $L_{\rm cosmological}$ is a
cosmological distance scale. Presumably, we have $L_{\rm cosmological} \gg\gg a^*$. In other words,
the universe could not care less if the graviton is composite at an energy scale of, say, 1 TeV.
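
To get a feel for the size of this hierarchy, here is a rough order-of-magnitude sketch. The numbers are illustrative assumptions, not taken from the text: the compositeness length is taken to be $\hbar c$ divided by 1 TeV, and the Hubble length $c/H_0$ with $H_0 \approx 70$ km/s/Mpc stands in for the cosmological distance scale.

```latex
% Sketch only. Assumptions: a^* ~ hbar c / (1 TeV); L_cosmological ~ c / H_0 with H_0 ~ 70 km/s/Mpc.
\begin{align*}
a^* &\sim \frac{\hbar c}{1\ \mathrm{TeV}}
     \approx \frac{1.97 \times 10^{-7}\ \mathrm{eV\,m}}{10^{12}\ \mathrm{eV}}
     \approx 2 \times 10^{-19}\ \mathrm{m}, \\
L_{\rm cosmological} &\sim \frac{c}{H_0} \approx 1.3 \times 10^{26}\ \mathrm{m}, \\
\frac{L_{\rm cosmological}}{a^*} &\sim 7 \times 10^{44}.
\end{align*}
```

Under these assumptions the two scales are separated by roughly 45 orders of magnitude.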

It is not compositeness that we are after. For example, in the proposal mentioned in 
appendix 1 of chapter X.7, the graviton propagator is modified at $q_{\rm cosmological}$ by abandoning
Lorentz invariance rather than by appealing to compositeness. 


Could gravity be part of a larger structure? 

Note that this is a different question from the one asked in the preceding section. We now 
understand the electromagnetic field as part of a larger structure.³⁵ Gravity could be part of
a larger structure in a mathematical sense, as electromagnetism, based on the gauge group 
U(1), is in fact part of a larger structure based on the gauge group SU(5) or SO(10). The 
larger structure reveals itself only at higher energies. But even if the structure is not seen 
at low energies, it imposes physical consequences. Thus, electric charge is quantized if 
the larger structure is a grand unified theory based on a simple group, and we understand 
why $Q_{\rm electron} = -Q_{\rm proton}$ exactly, a fact of cosmological significance. There is no way of
understanding this fact within electromagnetism itself. 
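
As a sketch of how a simple group forces this relation, consider the standard SU(5) assignment, which is an illustration not spelled out in the text. The electric charge is a traceless generator, and the $\bar{5}$ representation contains three colors of the anti-down quark together with the electron and its neutrino. I assume here that the neutrino is neutral and that the two members of a weak doublet differ by the same charge.

```latex
% Sketch only. Assumptions: Q is a traceless SU(5) generator; the \bar 5 holds
% (\bar d in 3 colors, e, \nu); Q_\nu = 0; weak doublet partners differ by Q_\nu - Q_e.
\begin{align*}
\operatorname{tr} Q\big|_{\bar 5} &= 3\,Q_{\bar d} + Q_{e} + Q_{\nu} = 0
 \;\Rightarrow\; Q_{d} = -Q_{\bar d} = \tfrac{1}{3}\, Q_{e}, \\
Q_{u} &= Q_{d} + (Q_{\nu} - Q_{e}) = -\tfrac{2}{3}\, Q_{e}, \\
Q_{\rm proton} &= 2\,Q_u + Q_d = -\tfrac{4}{3}\, Q_e + \tfrac{1}{3}\, Q_e = -Q_e .
\end{align*}
```

The factor of 3 in the trace is the number of colors, so the exact equality of the two charges is tied to the quarks coming in three colors.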

This is an example of an unintended consequence in theoretical physics, in this case 
a consequence of unifying electromagnetism with the strong and the weak interactions. 
Perhaps the answers to some of the questions we are asking also have unintended con- 
sequences. It is conceivable, for example, that unifying gravity with the other three inter- 
actions is possible only if there are three families of quarks and leptons, the existence of 
which poses one of the most puzzling questions in particle physics.³⁶

The question, again stated in the format of an IQ test question, is then “What is to gravity 
as grand unified theory is to electromagnetism?” 

In chapter X.7, I mentioned two analogies to the cosmological constant paradox. Here 
is yet another. The history of physics contains a number of logical impasses. One of my 
favorite examples is radiant heat. The leading theory of heat at one time held that matter 
contained a mysterious fluid known as the caloric, but the boring of cannons (see endnote 
1 in chapter VI.4) showed that the amount of caloric in iron appeared to be unlimited. 
The alternative theory held that heat was due to the motion of molecules, which we now 
know is the correct explanation, but this theory suffered from a fatal logical impasse: the 
phenomenon of radiant heat. How could molecular motion be transmitted across empty³⁷
space? The logical contradiction seemed insupportable.

The paradox was solved by the discovery that radiant heat was a form of electromagnetic
energy. Perhaps the cosmological constant paradox will be solved in an analogous way: a 
piece of physics is missing from our present understanding. 


Then there is the mystery of time. In our current description, space is created at the Big 
Bang, but not time. In spacetime, space and time are unified, but time, to paraphrase 
what Einstein said about gravity, appears to be first among equals. Various authors³⁸ have
suggested that time may be discrete, but discrete time appears to be conceptually more
difficult than discrete space, although numerical workers use it routinely.³⁹

In the AdS/CFT correspondence mentioned in chapter IX.11, from the point of view of
the physicists living on the boundary, a spatial coordinate appears to emerge (as mentioned 
earlier). A subject of current research is a possible dS/CFT correspondence. If this were 
to be realized, then we would have the intriguing scenario of time emerging. 


In Einstein gravity, the origin of spacetime is intimately linked to the origin of gravity. 
Emergent spacetime has been discussed in a variety of contexts. It has long been known 
in condensed matter physics that various lattice Hamiltonians lead to emergent gauge 
fields⁴⁰ in the low energy effective theory. And it was speculated that the gauge fields
responsible for the three nongravitational interactions could all be emergent from an
underlying lattice system containing only quantum spins.⁴¹ Given this background, it is
natural to speculate that the graviton also emerges from some underlying lattice system.⁴²
But while it is surprisingly easy⁴³ for a gauge field to emerge from a condensed matter
system, it is very difficult, because of the Weinberg-Witten theorem, for a gravitational 
field to emerge. 


A year before Einstein’s death, John Wheeler asked the old man to speak to a select group 
of students. Besides repeating his opposition to quantum mechanics, Einstein also made 
a cryptic comment: “There is much reason to be attracted to a theory with no space, no 
time. But nobody has any idea how to build it up.”⁴⁴

Perhaps we have to go beyond space and time. But these are just words. As the old man
said, nobody knows how.

Over time, many speculative thoughts about gravity have been thought, some from the
great, some from the not-so-great, and more than plenty from the cracked. To paraphrase 
the grand old man, future speculations should be as wild as possible—in the way that 
quantum physics would have seemed utterly wild to prequantum physicists (such as 
Maxwell) and that curved spacetime would have seemed utterly wild to prerelativistic 
physicists (such as Newton)—but not any wilder. That is the difference between theoretical 
physics and cracked pottery: the “not any wilder” part. 

This has been a long trek, in spite of what I said at the beginning of chapter VI.1. And 
now I end, with an exhortation to the reader, quoting Henry David Thoreau: “What old
people say you cannot do, you try and find that you can. Old deeds for old people, and new
deeds for new.”⁴⁵


Notes 


1. F. Dyson, UNESCO lecture, 1965.

2. In the same sense as the closing words in my textbook on quantum field theory.

3. String theory satisfies feature 2, inputting the existence of gravity, but fails at feature 3.

4. Scalar fields are fields without qualities, colorless individuals with no character or personality who could
blend in anywhere. Quantum field theory textbooks start with scalar fields for pedagogical clarity, precisely
because they are without qualities.

We could use scalar fields to do practically anything we want. They fit in anywhere. In cosmology, scalar
fields are used all the time and all over the place. They could drive inflation. They could account for dark
energy and perhaps even dark matter. Almost anything could be explained with scalar fields. Attempted long
distance modifications of gravity essentially all amount to adding scalar fields. Scalar fields are way too cheap
and so so painless. Just throw them in. People get them for free.

Perhaps we should feel a bit uneasy? I have no objection to composite scalar fields, of course, but then a
deeper dynamical understanding is called for.

5. Such as the location of the acoustic peaks in the microwave background discussed in chapter VIII.3.

6. Sometimes compared to and contrasted with the standard model of particle physics.

7. I touched upon the coincidence problem in chapter VIII.1.

8. But recall from chapter VIII.3 that there is another coincidence problem that we apparently don’t need to
worry about: photon decoupling and matter dominance also occurred at roughly the same time.


9. Sounds a bit like the perfect celestial dome that our predecessors talked about. See chapter VIII.1. 


10. As explained in chapter V.3, observations can only set a lower bound on the universe’s characteristic length 
scale. 

11. As mentioned in appendix 3 of chapter X.7. 

12. See appendix 4 of chapter X.8. 

13. As mentioned in chapter VII.3, for example, and in chapter X.8. 

14. I don’t subscribe to this idea. Even if the measurable effect of quantization is far too small for actual 
measurement, a consistent quantization might require something else, for example, that the dimension 
of spacetime be 4. See endnote 15. 

15. As mentioned in chapter X.8, the two are logically distinct issues. 

16. The Dao that can be expressed in words is not the true Dao. 


17. And, indeed, this assertion is supposed to account for the entropy of black holes. See the discussion of 
entanglement entropy in the literature. 


18. As explained in chapter X.8. 


19. Apparently, Planck did not believe in the equipartition theorem, so that for physicists like him, the ultraviolet 
catastrophe was not a catastrophe at all. 

20. Jordan’s anticommutation manuscript languished in Born’s pocket for a whole year. As if noncommutation 
were not shocking enough! 

21. See, for example, A. Zee in Quantum Coherence: 30 Years of Aharonov-Bohm Effect, ed. J. Anandan, Consider 
the effect discovered by ’t Hooft et al. that binding a boson to a magnetic monopole produces a fermion. The 
group theory of SO(10) is also highly suggestive. See QFT Nut, p. 428. 

22. As alluded to in chapter X.6. 

23. Appropriately color stripped. 

24. Fermat’s least time principle does not know that light is a wave. But here the Einstein-Hilbert action certainly 
knows about the graviton and its interaction with other gravitons. Could it be that, analogous to Fermat’s 
principle, there are things that the Einstein-Hilbert action does not know about? 

25. As described in appendix 1 of chapter X.7. 

26. So what is the lesson? If your equations do not need it, does it exist? 

27. This reminds me of an old puzzle. What is greater than God, and if you eat it, you will die? (Sometimes I 
use this to puzzle young children.) 

28. A. Garg, Electromagnetism in a Nutshell, Princeton University Press, 2012. 

29. As in the rebellious symphony alluded to in appendix 1 of chapter X.7. 

30. However, the mysteries of quantum mechanics have also led to entanglement and the Aharonov-Bohm effect. 

31. But confusingly, while we cannot directly perform local measurements to detect the presence of a horizon, we 
can do so indirectly. By measuring whether light rays tend to converge or diverge, we can detect the presence of 
a trapped surface (or apparent horizon). A sequence of highly plausible theorems (each of which nevertheless 
involves some technical assumptions) by Penrose, Ellis, and others, combined with the unproven cosmic 
censorship conjecture, states that the presence of a trapped surface implies the presence of a horizon. 

32. Imagine a civilization in some other galaxy that developed a theory of light based on the intensity and 
polarization of light beams. Color could be expressed as a 2-dimensional vector based on something like 
our color wheel. The theory could account for most observations but would eventually be found to be lacking 
when confronted with wave phenomena. Is our theory of gravity an analog of this kind of theory? 

33. As explained in appendix 1 to chapter X.7. 

34. See, for example, G. Horowitz and J. Polchinski, in D. Oriti (as cited in chapter X.8, note 2). 

35. Gerard ’t Hooft has given an elegant expression for the Maxwell field $F_{\mu\nu}$ in terms of the Yang-Mills field 
$F^a_{\mu\nu}$. Is there an analog for gravity? Can $g_{\mu\nu}$ be written in terms of some more elaborate object $G^a_{\mu\nu}$? 

36. In the dream of the ultimate theory, everything will be fixed, not just the fact that there are three families. 
That would be the final response to the anthropic alternative to physics. 

37. I am not enough of a historian to know whether attempts were made to measure the transfer of radiant heat 
across a chamber with its air pumped out. 

38. Including T. D. Lee and G. ’t Hooft. 

39. As explained in our discussion of the initial value problem in chapter VI.6. 

40. There is an extensive literature starting from the late 1980s with the discovery of fractional quantum Hall 
fluids and of high temperature superconductivity. 

41. For one particular example, see A. Zee, “Emergence of Spinor from Flux and Lattice Hopping,” in M. A. B. 
Bég Memorial Volume, ed. A. Ali and P. Hoodbhoy, World Scientific, 1990. 

42. See especially the work of X. G. Wen and his collaborators. 

43. For example, write the spin field (a unit vector $\vec n$) as $\vec n(x) = z^\dagger(x)\vec\sigma z(x)$ in terms of a spinor field $z(x)$ and 
the Pauli matrices $\vec\sigma$ (as introduced in chapter X.6). The local symmetry $z(x) \to e^{i\theta(x)}z(x)$ leads naturally 
to a gauge potential. See, for example, X. G. Wen and A. Zee, “Possible T and P Breaking Vacua of O(3) 
Non-Linear σ-Model and Spin Charge Separation,” Phys. Rev. Lett. 63 (1989), p. 461. 

44. T. Damour, O. Darrigol, B. Duplantier, and V. Rivasseau, eds., Einstein, 1905-2005, Poincaré Seminar 2005, 
Birkhäuser, 2006, p. 174. 

45. The proposal mentioned in appendix 1 of chapter X.7 might conceivably be a first tentative step in this 
direction. 

46. H. D. Thoreau, Walden. 


Timeline of Some of the People Mentioned 


Galileo Galilei (1564-1642) 

René Descartes (1596-1650) 

Pierre Fermat (1601 or 07/08?-1665) 

Isaac Newton (1643-1727 [1642-1726]) 
Gottfried Leibniz (1646-1716) 

Johann Bernoulli (1667-1748) 

Leonhard Euler (1707-1783) 

Jean-Baptiste le Rond d’Alembert (1717-1783) 
John Michell (1724-1793) 

Joseph-Louis Lagrange (1736-1813) 
D’Amondans Charles de Tinseau (1748-1822) 
Pierre-Simon, Marquis de Laplace (1749-1827) 
Carl Friedrich Gauss (1777-1855) 

Friedrich Wilhelm Bessel (1784-1846) 
Urbain Jean Joseph Le Verrier (1811-1877) 
Jean Frenet (1816-1900) 

Joseph Serret (1819-1885) 

Georg Friedrich Bernhard Riemann (1826-1866) 
Elwin Bruno Christoffel (1829-1900) 

Julius Weingarten (1836-1910) 

Marius Sophus Lie (1842-1899) 
Wilhelm Killing (1847-1923) 
Woldemar Voigt (1850-1919) 
Gregorio Ricci-Curbastro (1853-1925) 
Hendrik Antoon Lorentz (1853-1928) 
Jules Henri Poincaré (1854-1912) 
Luigi Bianchi (1856-1928) 

Max Planck (1858-1947) 

David Hilbert (1862-1943) 

Hermann Minkowski (1864-1909) 
Élie Cartan (1869-1951) 

Gustav Mie (1868-1957) 

Willem de Sitter (1872-1934) 

Karl Schwarzschild (1873-1916) 

Max Abraham (1875-1922) 

Albert Einstein (1879-1955) 

Gunnar Nordström (1881-1923) 
Amalie Emmy Noether (1882-1935) 
Arthur Eddington (1882-1944) 
Hermann Weyl (1885-1955) 

Theodor Kaluza (1885-1954) 
Johannes Droste (1886-1963) 
Alexander Alexandrovich Friedman (1888-1925) 
Attilio Palatini (1889-1949) 

Kornel Lanczos (1893-1974) 

Georges Lemaître (1894-1966) 

Oskar Klein (1894-1977) 

George Szekeres (1911-2005) 
Richard Feynman (1918-1988) 


Solutions to Selected Exercises 


In the book of life, the answers aren’t in the back. 


—Charles M. Schulz, speaking through 
Charlie Brown 


Prologue 


1 With x and L as labeled in figure 1 (with sand replaced by air), the time getting from F to G is given 
by $T = c_a^{-1}\sqrt{x^2 + A^2} + c_w^{-1}\sqrt{(L - x)^2 + B^2}$. (Since the math involved is high school level, I won't even 
bother to define A and B.) Setting the derivative of T with respect to x to 0, we obtain $c_w x/\sqrt{x^2 + A^2} = c_a(L - x)/\sqrt{(L - x)^2 + B^2}$, 
which we recognize as $c_w \sin\theta_a = c_a \sin\theta_w$. 
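For readers who like to see the least time principle at work numerically, here is a minimal sketch. It assumes a travel time of the form $\sqrt{x^2+A^2}/c_a + \sqrt{(L-x)^2+B^2}/c_w$; the numerical values of L, A, B and of the two speeds are purely illustrative, and the names `travel_time` and `minimize` are invented for this example.

```python
import math

def travel_time(x, L, A, B, c_a, c_w):
    """Time from F to G when the interface is crossed at horizontal position x."""
    return math.hypot(x, A) / c_a + math.hypot(L - x, B) / c_w

def minimize(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2

# Illustrative numbers only (assumptions, not taken from the text).
L, A, B = 3.0, 1.0, 2.0
c_a, c_w = 1.0, 0.75

x_star = minimize(lambda x: travel_time(x, L, A, B, c_a, c_w), 0.0, L)
sin_a = x_star / math.hypot(x_star, A)            # sine of the angle in air
sin_w = (L - x_star) / math.hypot(L - x_star, B)  # sine of the angle in water

snell_lhs = c_w * sin_a
snell_rhs = c_a * sin_w
print(snell_lhs, snell_rhs)
```

The two printed numbers should agree to within the accuracy of the search, which is just the statement $c_w \sin\theta_a = c_a \sin\theta_w$.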


I.1 Newton’s Laws 


1 The first part is obvious since $\delta(x)$ is sharply spiked at x = 0, so that in the integrand we can replace f(x) 
by f(0) and then do the integral. In the second part, change variable to y = ax and note that the limits of 
integration depend on the sign of a. 


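2 Write $r' = \frac{dr}{d\theta} = -\frac{1}{u^2}u'$, so that the equation $\left(\frac{dr}{d\theta}\right)^2 = \frac{2r^4}{l^2}(\epsilon - v(r))$ becomes 

$$u'^2 + u^2 - \frac{2\kappa}{l^2}u = \frac{2\epsilon}{l^2}$$

You recognize this as just the shifted harmonic oscillator, which you solve instantly as 

$$\frac{1}{r} = \frac{\kappa}{l^2}(1 + e\cos\theta)$$

with the eccentricity e given by $e^2 = 1 + \frac{2\epsilon l^2}{\kappa^2}$. That the orbit closes is now obvious. 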


3 We could still use (19) except that the root $u_{\min}$ is now negative, which is not physical since u = 1/r > 
0. A moment’s thought indicates that the lower limit for the u integral in (19) should be set to 0. 

Changing integration variable as before, we obtain $\Delta\theta = 4\int_{\zeta_{\min}}^{\pi/2} d\zeta$, with $\zeta_{\min}$ determined by 

$$\sin^2\zeta_{\min} = \frac{-u_{\min}}{u_{\max} - u_{\min}} = \frac{1}{2}\left(1 - \frac{\kappa}{\sqrt{\kappa^2 + 2\epsilon l^2}}\right)$$

First, let’s check that we are on the right track by turning off gravity: set $\kappa = 0$, 
then $\zeta_{\min} = \pi/4$ and $\Delta\theta = \pi$, the correct answer for light moving in a straight line. Next, expanding to 
leading order in $\kappa$, we obtain $\Delta\theta = \pi + \frac{2\kappa}{l(2\epsilon)^{1/2}}$. We now express the deflection angle (as usually understood, 
that is, straight line corresponds to no deflection) $\Delta\theta = \frac{2\kappa}{l(2\epsilon)^{1/2}}$ in terms of the impact parameter b defined 
by saying that as $x \to \infty$, the light ray moves along a path specified by y = b. Translating into polar 
coordinates, we have, as $r \to \infty$, from (9) $b = r\theta$, from (18) $\dot r^2 \simeq 2\epsilon$ and $\dot r \simeq -\sqrt{2\epsilon}$. Using (15) and 
$\dot\theta = \frac{d\theta}{dr}\dot r \simeq -\frac{b}{r^2}\dot r$, we determine $l = b\sqrt{2\epsilon}$. Thus, $\Delta\theta = \frac{\kappa}{\epsilon b}$. 

Newton of course did not know what the speed of light was, but if we set $\dot r^2 = c^2$ so that $\epsilon = c^2/2$, we 
find the Newtonian result for the deflection of starlight by the sun 

$$\Delta\theta = \frac{2GM}{c^2 b}$$
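To get a feeling for the size of this effect, the short sketch below evaluates the Newtonian formula for a ray grazing the sun. The values used for GM of the sun, the solar radius, and the speed of light are standard reference numbers supplied here as assumptions, not quantities quoted in the text, and the function name is invented for the illustration.

```python
import math

# Reference values in SI units (assumptions for this illustration).
GM_SUN = 1.32712440018e20   # gravitational parameter of the sun, m^3 / s^2
R_SUN = 6.957e8             # nominal solar radius, m
C = 299_792_458.0           # speed of light, m / s

def newtonian_deflection(b):
    """Newtonian deflection angle (radians) for impact parameter b (meters)."""
    return 2.0 * GM_SUN / (C**2 * b)

# Light grazing the limb of the sun: impact parameter equal to the solar radius.
deflection_arcsec = math.degrees(newtonian_deflection(R_SUN)) * 3600.0
print(deflection_arcsec)
```

The result is a bit under one second of arc, which gives some idea of how delicate the measurement has to be.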


Any spherical mass distribution can be built up by stacking up spherical shells. The potential at 
(x = 0, y = 0, z = R) due to a single Newtonian shell of radius a, thickness $\delta$, and mass density $\rho$ is 
then given by 

$$V(R) = \int \frac{G\rho a^2\delta\, d\theta\, \sin\theta\, d\varphi}{\sqrt{R^2 + a^2 - 2Ra\cos\theta}} = 2\pi G\rho a^2\delta \int_{-1}^{1} \frac{du}{\sqrt{R^2 + a^2 - 2Rau}}$$

$$= 2\pi(G\rho a\delta/R)\left[(R^2 + a^2 + 2Ra)^{1/2} - (R^2 + a^2 - 2Ra)^{1/2}\right]$$

Outside the shell, R > a and the square bracket evaluates to (R + a) − (R − a) = 2a and so $V(R)_{\rm outside} = G\rho(4\pi a^2\delta)/R = GM_{\rm shell}/R$, 
the first superb theorem. Inside the shell, R < a and the square bracket 
evaluates to (R + a) − (a − R) = 2R and so $V(R)_{\rm inside} = G\rho(4\pi a\delta)$, a constant independent of R, which means the shell exerts no 
force on an observer inside (he is tugged equally in all directions): the second superb theorem. 
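The two superb theorems can also be checked by brute force. The sketch below integrates over the shell numerically, using the variable $u = \cos\theta$ as in the solution; the choice of units ($G = \rho = \delta = 1$), the midpoint rule, and the function name `shell_potential` are assumptions made only for this illustration.

```python
import math

def shell_potential(R, a, G=1.0, rho=1.0, delta=1.0, n=200_000):
    """Potential at distance R from the center of a thin shell of radius a,
    from a midpoint-rule integration over u = cos(theta) in [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * h
        total += h / math.sqrt(R * R + a * a - 2.0 * R * a * u)
    return 2.0 * math.pi * G * rho * a * a * delta * total

a = 1.0
shell_mass = 4.0 * math.pi * a * a      # rho * delta = 1 in these units

outside = shell_potential(2.5, a)       # should equal G * M_shell / R
inside_near_center = shell_potential(0.3, a)
inside_near_shell = shell_potential(0.7, a)

print(outside, shell_mass / 2.5)
print(inside_near_center, inside_near_shell, 4.0 * math.pi * a)
```

Outside, the shell acts as if all its mass sat at the center; inside, the potential does not care where you stand.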


I.3 Rotation: Invariance and Infinitesimal Transformation 


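Intuitively, it should be obvious, since $\int dx\,dy\,\delta(x)\delta(y) f(x, y) = f(0, 0)$ just picks out the value of the 
function f at the origin. More formally, we have 

$$\delta(x')\delta(y') = \delta(\cos\theta\, x + \sin\theta\, y)\,\delta(-\sin\theta\, x + \cos\theta\, y)$$

$$= \delta(\cos\theta\, x + \sin\theta\, y)\,\delta\left(-\sin\theta\, x - \frac{\cos^2\theta}{\sin\theta}x\right)$$

$$= \delta(\cos\theta\, x + \sin\theta\, y)\,\delta\left(-\frac{1}{\sin\theta}x\right) = \delta(\sin\theta\, y)\,\delta\left(-\frac{1}{\sin\theta}x\right) = \delta(x)\delta(y)$$

where the second equality follows since the first delta function forces $y = -\frac{\cos\theta}{\sin\theta}x$. This result can be 
generalized to any dimension. For example, in 3-dimensional space, $\delta(x')\delta(y')\delta(z') = \delta(x)\delta(y)\delta(z)$, a 
result we will use in chapter II.1. 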


We could either perform the integral after writing down the components of p explicitly, or argue by 
rotational invariance that the integral must be proportional to $\delta^{ij}$. The proportionality constant could be 
then determined by contracting with $\delta^{ij}$ (using the repeated index summation convention). 


I.4 Who Is Afraid of Tensors? 


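Write $\vec{\mathcal{L}} = \vec l \times \dot{\vec r} + C(r)\vec r$ with $C(r) = \frac{\kappa}{r}$ and differentiate: $\dot{\vec{\mathcal{L}}} = \vec l \times \ddot{\vec r} + C(r)\dot{\vec r} + \dot C(r)\vec r$. Use $\ddot{\vec r} = -\frac{\kappa}{r^3}\vec r$, 
$r^2 = \vec r^{\,2}$, so that $r\dot r = \vec r \cdot \dot{\vec r}$, and the identity derived in the text, so that $\vec l \times \vec r = r^2\dot{\vec r} - (\dot{\vec r}\cdot\vec r)\vec r$. From here a 
few lines of arithmetic lead to $\dot{\vec{\mathcal{L}}} = 0$. I am dealing with the Laplace-Runge-Lenz vector per unit mass here. 

Of course, if you like, you can multiply everything by m and write $\vec p = m\dot{\vec r}$. 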


$S^{ij}A^{ij} = -S^{ji}A^{ji} = -S^{ij}A^{ij} = 0$, since something equal to its own negative has to vanish. Note that the 
second equality follows from relabeling the dummy summation indices. 


10 


16 


7 
Cyclically permute the definition of H: 
kil = GH 4 GH 
Hisk = Gir-k 4 Giles 
Hie = Gikei 4 Giik 


Add the first two lines and subtract the third. Then G‘’/ = 5 (HR + Hesk — Ask), 


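For example, for D = 2, let us evaluate $\epsilon^{ij}R^{ip}R^{jq}$ for p = 1, q = 2. We have $\epsilon^{ij}R^{i1}R^{j2} = \epsilon^{12}R^{11}R^{22} + 
\epsilon^{21}R^{21}R^{12} = \epsilon^{12}(R^{11}R^{22} - R^{21}R^{12}) = \epsilon^{12}\det R$. 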


I.5 From Change of Coordinates to Curved Spaces 


We have $ds^2 = g_{\mu\nu}(x + dx)(-dx^\mu)(-dx^\nu) = g_{\mu\nu}(x)dx^\mu dx^\nu + \cdots$, with the dots indicating higher order 
terms. 


As explained in appendix 2, this space is just $E^3$. Transform coordinates by $x = \sqrt{r^2 + a^2}\sin\theta\cos\varphi$, 
$y = \sqrt{r^2 + a^2}\sin\theta\sin\varphi$, $z = r\cos\theta$. Note that r = 0 represents a disk of radius a in the (x-y) plane. The 
surfaces of constant r are ellipsoids, and the lines of fixed $\theta$ and $\varphi$ are hyperbolas. 


+1 and —1, respectively. 


This follows immediately from the result of exercise 9 with 0, renamed 6. We can also see this more 
geometrically by noting that the defining equation for $S^d$, namely $(X^1)^2 + (X^2)^2 + \cdots + (X^{d+1})^2 = 1$, 
may be written as $(X^1)^2 + (X^2)^2 + \cdots + (X^d)^2 = 1 - (X^{d+1})^2$, that is, as made of a collection of $S^{d-1}$ with 
radius $\sqrt{1 - (X^{d+1})^2}$ as $X^{d+1}$ ranges from −1 to +1. 


Let the torus be formed out of a flexible cylindrical tube of radius a and length $2\pi L$. It is embedded in $E^3$ 
according to $X = (L + a\sin\theta)\cos\varphi$, $Y = (L + a\sin\theta)\sin\varphi$, $Z = a\cos\theta$. Note that the two coordinates 
$\theta$ and $\varphi$ on the torus run from 0 to $2\pi$, with $\theta$ winding around the tube and $\varphi$ running around the “hole” 
of the torus. Then $ds^2 = dX^2 + dY^2 + dZ^2 = a^2d\theta^2 + (L + a\sin\theta)^2d\varphi^2$. 


Given $ds^2 = A\,du^2 + B\,dv^2 + 2C\,du\,dv$, with A, B, C functions of u, v. Let u = f(x, y), v = g(x, y), with 
two unknown functions f and g, so that 

$$du = f_x dx + f_y dy, \qquad dv = g_x dx + g_y dy$$

Plugging in, we have 

$$ds^2 = A(f_x dx + f_y dy)^2 + B(g_x dx + g_y dy)^2 + 2C(f_x dx + f_y dy)(g_x dx + g_y dy)$$

Collecting terms and setting the coefficient of dxdy to 0 and the coefficients of $dx^2$ and of $dy^2$ equal to 
each other, we obtain two equations that we can solve for $f_y$ and $g_y$. In other words, we have two equations 
giving $\partial_y f$ and $\partial_y g$ in terms of $f_x$, $g_x$ and A, B, C. Now think of this as an initial value problem with 
y playing the role of time. Let us specify the two functions $f(x, y_0)$ and $g(x, y_0)$ at some initial time 
$y_0$. Our two equations tell us what $\partial_y f$ and $\partial_y g$ are, which allows us to determine the two functions 
$f(x, y_0 + \delta y)$ and $g(x, y_0 + \delta y)$ at some infinitesimally later time $y_0 + \delta y$. In other words, we can integrate 
to obtain the unknown functions f(x, y) and g(x, y), at least within some local region. (Of course, in the 
integration, the functions A, B, C are to be treated as functions of x, y, that is, A = A(f(x, y), g(x, y)) 
and so forth.) Thus, within some coordinate patch, the metric can be written in the conformally flat form 
$ds^2 = \Omega^2(x, y)(dx^2 + dy^2)$. 

Change coordinates by $u = \log r$, $v = \theta$, and we obtain $ds^2 = (dr^2 + r^2d\theta^2)$. The space is just the plane 
in disguise. 


I.6 Curved Spaces: Gauss and Riemann 


With no loss of generality, we can pick the point P to be (x, y) = (0, y,). A particular set of locally flat 
coordinates (u, v) is given byx = y,(u+uv+---),y=y,(v4 5 (v? u2) +--+), where the dots represent 
terms cubic and higher in w, v. We could of course trivially rotate (u, v) (and also translate) to obtain 
another set of locally flat coordinates. 


From gj, = (1+ u2u), we have By1,22 = 0. Similarly, By 1, = 0. From By 42 = FHV. So 2By 12 — Bi,22 — 
By7,11 = LV, which is indeed the intrinsic curvature. 


We have $ds^2 = dx^2 + dz^2/\left(pz^{\frac{p-1}{p}}\right)^2$. For example, for $z = y^2$, $g_{zz} = 1/(4z)$ blows up at z = 0. 


Let one line segment go from the point x to $x + (\Delta x)_1$, and the other to $x + (\Delta x)_2$. The angle between 
the two line segments is given by 

$$\cos\theta = (\Delta x)_1 \cdot (\Delta x)_2 \Big/ \sqrt{((\Delta x)_1 \cdot (\Delta x)_1)((\Delta x)_2 \cdot (\Delta x)_2)}$$

(defining $(\Delta x)_1 \cdot (\Delta x)_2 = g_{\mu\nu}(\Delta x)_1^\mu (\Delta x)_2^\nu$ and generalizing the standard high school formula for the 
scalar dot product). Suppose we now calculate the angle with the metric $\tilde g_{\mu\nu} = \Omega^2 g_{\mu\nu}$: the factors of $\Omega$ evidently 
cancel between the numerator and the denominator. 


Solution by counting. We can choose D functions, but we have to satisfy $\frac{1}{2}D(D + 1) - 1$ conditions. We 
have enough freedom for D = 2, but not for D > 2. 


— x+y 
R= 2x2y2 


I.7 Differential Geometry Made Easy, But Not Any Easier! 


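We calculate the components of $\vec V$ along the two basis vectors, namely $\vec V \cdot \vec e_\mu(y)$, and then use these 
two components to form a linear combination of the two basis vectors. These words translate into an 
expression for $\vec V$ projected into the tangent plane: $\vec V_P(y) = (\vec V \cdot \vec e_\mu(y))g^{\mu\nu}(y)\vec e_\nu(y)$, where $g^{\mu\nu}(y)$ is the 
inverse of the 2-by-2 matrix $g_{\mu\nu}(y)$ defined by $g^{\mu\nu}(y)g_{\nu\lambda}(y) = \delta^\mu_\lambda$. (For the sphere, $g_{\mu\nu} = \begin{pmatrix} 1 & 0 \\ 0 & \sin^2\theta \end{pmatrix}$ and 
$g^{\mu\nu} = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{\sin^2\theta} \end{pmatrix}$.) To see that the inverse is needed here, take the dot product of $\vec V_P(y)$ and a basis vector 
$\vec e_\lambda(y)$: 

$$\vec V_P(y) \cdot \vec e_\lambda(y) = (\vec V \cdot \vec e_\mu(y))g^{\mu\nu}(y)(\vec e_\nu(y) \cdot \vec e_\lambda(y)) = (\vec V \cdot \vec e_\mu(y))g^{\mu\nu}(y)g_{\nu\lambda}(y) = (\vec V \cdot \vec e_\mu(y))\delta^\mu_\lambda = (\vec V \cdot \vec e_\lambda(y))$$

In other words, $(\vec V_P(y) - \vec V) \cdot \vec e_\lambda(y) = 0$, which is just another way of saying that $\vec V_P(y)$ and $\vec V$ differ by a 
vector normal to the surface. In other words, we subtracted out the component of $\vec V$ normal to the surface 
from $\vec V$ to obtain $\vec V_P(y)$. 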


Multiply the two eigenvalue equations (for i = 1, 2) 


(Ky 3° KiSuv ty =0 


by e (with 7 4) and subtract one from the other. We obtain (k, — k)g uolye t) = 0. For those readers who 
have studied quantum mechanics, does this remind you of the proof of wave function orthogonality? 


1 


1.2 
Take the dot product: é, -é,, 
IC Ey) = 9,8 ou = r, suv + Ty.pv» Which we can solve for P using 1.4.15. 


= eee =T,.,»- Interchange p <+ yw: é,,-é, , =Ty..py- Add to obtain 


II.1 The Hanging String and Variational Calculus 


Once again, we can solve $\nabla^2\Phi = GM\delta^{(D)}(\vec x)$ by dimensional analysis. It is also easy to verify the solution 
explicitly. By rotational invariance, $\Phi$ can only depend on $r^2 = \sum_i x^i x^i$. Differentiate this to obtain 
$r\,dr = \sum_i x^i dx^i$, so that $\frac{\partial r}{\partial x^i} = \frac{x^i}{r}$. Then we have 

$$\sum_i \frac{\partial}{\partial x^i}\frac{\partial}{\partial x^i}\frac{1}{r^p} = -p\sum_i \frac{\partial}{\partial x^i}\frac{x^i}{r^{p+2}} = \frac{p(p + 2 - D)}{r^{p+2}}$$

We see that $\Phi \propto 1/r^p$ with p = D − 2 solves the equation for $\vec x$ away from the origin. The potential goes 
like $1/r^{D-2}$, and so the force law is an inverse (D − 1) law. 
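The formula for the Laplacian of $1/r^p$ in D dimensions is easy to test numerically. The sketch below uses a simple second order finite difference; the sample point, the step size, and the function names are arbitrary choices made for this illustration.

```python
import math

def inverse_power(x, p):
    """The function 1 / r^p evaluated at the point x in D = len(x) dimensions."""
    r = math.sqrt(sum(c * c for c in x))
    return r ** (-p)

def laplacian(f, x, h=1e-3):
    """Second-order central finite-difference Laplacian of f at the point x."""
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        total += (f(forward) - 2.0 * f(x) + f(backward)) / (h * h)
    return total

def predicted(x, p):
    """The closed form p (p + 2 - D) / r^(p + 2)."""
    D = len(x)
    r = math.sqrt(sum(c * c for c in x))
    return p * (p + 2 - D) / r ** (p + 2)

sample = [0.6, -0.8, 0.5, 1.1, 0.3]   # arbitrary point away from the origin

# With p = D - 2 the Laplacian should vanish away from the origin.
harmonic_checks = {
    D: laplacian(lambda x: inverse_power(x, D - 2), sample[:D]) for D in (3, 4, 5)
}

# A non-harmonic case, p = 1 in D = 5, to test the general formula.
numeric_generic = laplacian(lambda x: inverse_power(x, 1), sample)
exact_generic = predicted(sample, 1)

print(harmonic_checks)
print(numeric_generic, exact_generic)
```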


Varying S with respect to b gives a′ = 0, and varying it with respect to a gives (r(1 − b))′ = 0. Fitting to 
the boundary conditions at spatial infinity gives a = 1 and $b = 1 - \frac{2M}{r}$. 


Describe the desired curve by y(x), with y the vertical axis. Released at rest at y = 0, the bead attains a 
speed of $v(y) = \sqrt{2gy}$ after falling a distance of y (with the coordinate y chosen to point downward). We 
see that the transit time (in suitable units) is given by 

$$T = \int dx \left[\frac{1}{y}\left(1 + \left(\frac{dy}{dx}\right)^2\right)\right]^{\frac{1}{2}}$$

You could proceed from here, derive the Euler-Lagrange equation, solve for y(x), and “rival” Newton more 
than 300 years later. Note that in spite of my remark in the text, in this context, the notation y(x) seems 
quite natural. 

Here is the instructive part of the problem. We could just as well have chosen y as the variable and 
solved for x(y). Then 

$$T = \int dy \left[\frac{1}{y}\left(1 + \left(\frac{dx}{dy}\right)^2\right)\right]^{\frac{1}{2}}$$

You can verify that, in contrast to the case with the previous choice, the second order differential equation 
can now be integrated trivially to yield the first order differential equation 

$$\left(\frac{dx}{dy}\right)^2 = \frac{y}{y^* - y}$$

with $y^*$ an integration constant. You could solve this easily with the change of variable $y = y^* \sin^2\theta$. 

Moral of the story: in solving variational problems, it pays to choose the independent variable wisely. 


II.2 The Shortest Distance between Two Points 


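Varying the stated quantity, we obtain $\frac{d}{d\tau}\left(2g_{\mu\nu}\frac{dX^\nu}{d\tau}\right) = (\partial_\mu g_{\lambda\nu})\frac{dX^\lambda}{d\tau}\frac{dX^\nu}{d\tau}$ almost instantly, but this is just 
(16). 

Simply plug $g_{\mu\nu} = e^{2\Omega}\delta_{\mu\nu}$ into (24) to obtain $\frac{d}{d\tau}(e^{2\Omega}\delta_{\mu\nu}V^\nu) = \partial_\mu\Omega$ (since $g_{\mu\nu}V^\mu V^\nu = 1$). Noting that 
$\frac{d}{d\tau}e^{2\Omega} = 2V^\nu\partial_\nu\Omega\, e^{2\Omega}$, we find, after cleaning up a bit, 

$$\frac{dV^\mu}{d\tau} + 2(V^\nu\partial_\nu\Omega)V^\mu = \partial^\mu\Omega$$

where $\partial^\mu\Omega = g^{\mu\nu}\partial_\nu\Omega$. 


II.3 Physics Is Where the Action Is 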


6S =2fdxtdx~ (PE fe 4 Bh BE) 4 fdxtdx St 54 


axt ax~ ax? Ox~ xt Ox 


III.1 Galileo versus Maxwell 


For convenience, write the velocity of the incoming ball as $2\vec v$. In the center of mass frame, one ball has 
velocity $\vec v$, the other $-\vec v$. After the collision, one ball has velocity $\vec u$, the other $-\vec u$. Energy conservation 
gives $\vec u^{\,2} = \vec v^{\,2}$. In the lab frame, after the collision, one ball has velocity $\vec v + \vec u$, the other $\vec v - \vec u$. Since 
$(\vec v + \vec u) \cdot (\vec v - \vec u) = \vec v^{\,2} - \vec u^{\,2} = 0$, the angle between the velocities of the two balls is 90°. 

Of course, the problem is so elementary that we can also easily do it in the lab frame. With $\vec v$ now denoting the 
incoming velocity, momentum and energy conservation give $\vec v = \vec u_1 + \vec u_2$, $\vec v^{\,2} = \vec u_1^{\,2} + \vec u_2^{\,2}$. Squaring the first equation and comparing with the 
second yields $\vec u_1 \cdot \vec u_2 = 0$ immediately. 
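The center of mass argument is easily turned into a few lines of code. In the sketch below, the speed 1.7 and the set of scattering angles are arbitrary choices, and the function name is made up for the illustration; the masses are taken to be equal, as in the solution.

```python
import math

def lab_velocities_after(v, angle):
    """Equal-mass elastic collision in two dimensions.

    The incoming ball has lab velocity (2v, 0); in the center of mass frame
    the balls leave with velocities +u and -u, where |u| = v and `angle`
    is the center of mass scattering angle.
    """
    u = (v * math.cos(angle), v * math.sin(angle))
    ball_1 = (v + u[0], u[1])     # boost back to the lab frame by (v, 0)
    ball_2 = (v - u[0], -u[1])
    return ball_1, ball_2

dot_products = []
for k in range(1, 12):
    ball_1, ball_2 = lab_velocities_after(1.7, k * math.pi / 12)
    dot_products.append(ball_1[0] * ball_2[0] + ball_1[1] * ball_2[1])

max_abs_dot = max(abs(d) for d in dot_products)
print(max_abs_dot)
```

Whatever the scattering angle, the dot product of the two outgoing lab velocities vanishes up to rounding error.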


III.3 Minkowski and the Geometry of Spacetime 


This is given in the preceding chapter. 


Differentiate $\eta_{\mu\nu}V^\mu V^\nu = -1$: $\frac{d}{d\tau}(\eta_{\mu\nu}V^\mu V^\nu) = 0 = 2\eta_{\mu\nu}a^\mu V^\nu = 2a_\nu V^\nu$. In the rest frame of the particle 
$V^\mu = (1, \vec 0)$ and hence $a^0 = 0$. 


We can choose V to point in the 1-direction and $x^2 = x^3 = 0$. Solving the three equations $V_\mu V^\mu = -1$, 
$a_\mu V^\mu = 0$, and $a_\mu a^\mu = g^2$, we obtain $a^0 = \frac{dV^0}{d\tau} = gV^1$ and $a^1 = \frac{dV^1}{d\tau} = gV^0$, giving the solution 
$t(\tau) = x^0(\tau) = g^{-1}\sinh g\tau$, $x(\tau) = x^1(\tau) = g^{-1}\cosh g\tau$. Thus, $x^\mu(\tau)$ traces out a hyperbola $x(\tau)^2 = 
t(\tau)^2 + g^{-2}$. We have $V^\mu = (\cosh g\tau, \sinh g\tau, 0, 0)$ and $a^\mu = g(\sinh g\tau, \cosh g\tau, 0, 0)$, which satisfy all 
the stated equations. Note that with some suitable adjustments, this shows that for fixed $\rho$, this coordinate 
transformation amounts to a transformation to the frame of an accelerating observer, with $T = g\tau$ and 
$\rho = g^{-1}$. 
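As a quick sanity check on the uniformly accelerated worldline, the sketch below verifies the three stated conditions and the hyperbola numerically. It assumes the signature (−, +, +, +) implied by the normalization of V; the value g = 0.8 and the sample proper times are arbitrary, and the function names are invented for this illustration.

```python
import math

def minkowski_dot(a, b):
    """Dot product with signature (-, +, +, +), an assumption of this sketch."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def position(g, tau):
    return (math.sinh(g * tau) / g, math.cosh(g * tau) / g, 0.0, 0.0)

def velocity(g, tau):
    return (math.cosh(g * tau), math.sinh(g * tau), 0.0, 0.0)

def acceleration(g, tau):
    return (g * math.sinh(g * tau), g * math.cosh(g * tau), 0.0, 0.0)

g = 0.8
worst = 0.0
for tau in (-2.0, -0.5, 0.0, 0.7, 2.5):
    X, V, a = position(g, tau), velocity(g, tau), acceleration(g, tau)
    residuals = (
        minkowski_dot(V, V) + 1.0,               # V.V = -1
        minkowski_dot(a, V),                     # a.V = 0
        minkowski_dot(a, a) - g * g,             # a.a = g^2
        X[1] ** 2 - X[0] ** 2 - 1.0 / (g * g),   # x^2 = t^2 + g^-2
    )
    worst = max(worst, max(abs(r) for r in residuals))

print(worst)
```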


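As explained in the text, we have $F^{\mu\nu} \to F'^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma F^{\rho\sigma}$. Hence, $F'^{0i} = \Lambda^0{}_\rho \Lambda^i{}_\sigma F^{\rho\sigma}$ and $F'^{ij} = 
\Lambda^i{}_\rho \Lambda^j{}_\sigma F^{\rho\sigma}$. For example, for a boost along the 1-axis, $\Lambda$ is given explicitly in the text, and you merely 
have to write out the repeated index sums. 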


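Denote by $\Lambda(x, \varphi)$ a boost in the x direction by the rapidity parameter $\varphi$, and so on and so forth. Then, 
with the abbreviation $c = \cosh\varphi$, $s = \sinh\varphi$, $c' = \cosh\varphi'$, $s' = \sinh\varphi'$, we have 

$$\Lambda(x, \varphi)\Lambda(y, \varphi') = \begin{pmatrix} c & s & 0 \\ s & c & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} c' & 0 & s' \\ 0 & 1 & 0 \\ s' & 0 & c' \end{pmatrix} = \begin{pmatrix} cc' & s & cs' \\ sc' & c & ss' \\ s' & 0 & c' \end{pmatrix} \qquad (29)$$

Next we follow what we did in appendix 2 to chapter I.3. Compare $\Lambda(y, \varphi')\Lambda(x, \varphi)$ with $\Lambda(x, \varphi)\Lambda(y, \varphi')$ 
by calculating 

$$(\Lambda(y, \varphi')\Lambda(x, \varphi))^{-1}\Lambda(x, \varphi)\Lambda(y, \varphi') \simeq I + \varphi\varphi'\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} = I + \varphi\varphi'[K_x, K_y] + \cdots = I - i\varphi\varphi' J_z \qquad (30)$$

Here we used the symmetry of $\Lambda$, so that $\Lambda(y, \varphi')\Lambda(x, \varphi) = (\Lambda(x, \varphi)\Lambda(y, \varphi'))^T$, and thus in computing 
the left hand side of (30) to the order indicated, we need only keep the ss′ term in (29). We have thus 
verified (24). 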
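That two boosts in different directions fail to commute by a rotation can be seen with explicit numbers. The sketch below works with 3-by-3 matrices acting on (t, x, y), uses the fact that the inverse of a boost is the boost with the opposite rapidity, and takes small illustrative rapidities; the helper names are invented here.

```python
import math

def boost_x(phi):
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def boost_y(phi):
    c, s = math.cosh(phi), math.sinh(phi)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

phi, phi_prime = 1e-3, 2e-3   # small rapidities (illustrative values)

forward = matmul(boost_x(phi), boost_y(phi_prime))
# The inverse of boost_y(phi') boost_x(phi) is boost_x(-phi) boost_y(-phi').
inverse = matmul(boost_x(-phi), boost_y(-phi_prime))
product = matmul(inverse, forward)

# Subtract the identity and divide by phi * phi' to expose the leading term.
scaled = [[(product[i][j] - (1.0 if i == j else 0.0)) / (phi * phi_prime)
           for j in range(3)] for i in range(3)]

for row in scaled:
    print(row)
```

Up to corrections of relative order of the rapidities, the only surviving entries are +1 and −1 in the x-y block, that is, an infinitesimal rotation about the z axis.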


III.5 The Worldline Action and the Unification of Material 
Particles with Light 


1 As in (15), we vary with respect to the auxiliary variable y,g, which we then eliminate. We would like to vary 
S with respect to yg. Using a matrix identity we have used again and again, we have by = — SV enV np 
and dy = VV PS yup. For ease of writing, define hyg = 0,X"4gX,,. The variation of the integrand in (21) 
thus gives 6[y 3 YP hag] =y 3 [5 YS Yen (VP Aap) — ¥ SY ony Nap: Setting the coefficient of 5y,,, to 0, we 
obtain 


hen = 5 Ven(V? ap) 


where the indices on h are raised and lowered by the metric y. Multiplying this equation by h”* (and 
summing over repeated indices), we find y*?h,g = 2 and thus y,, =h,,,. Plugging this into (21), we find 


that S = 37 f dtdo (det h) 22. Thus, S and SnNambu-Goto are indeed equivalent in the sense that they lead 
to the same equation of motion. 


III.6 Completion, Promotion, and the Nature of the Gravitational Field 


1 Consider a head-on collision $p + k \to p' + k'$ with p = (E, 0, 0, p) (note the trivial abuse of notation 
here), $k = \omega(1, 0, 0, -1)$, and $k' = \omega'(1, 0, \sin\theta, \cos\theta)$. Minkowski squaring $p' = p + k - k'$ gives us, 
for $\omega < p$, 

$$\omega' = \frac{\omega(E + p)}{E + \omega - (p - \omega)\cos\theta}$$

which is maximized when $\cos\theta = 1$. For a highly relativistic particle, 
$p = \sqrt{E^2 - m^2} \simeq E - \frac{m^2}{2E}$, and we obtain the stated result. 
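The expression for the scattered frequency can be checked by reconstructing the outgoing particle momentum and verifying that it is still on the mass shell. In the sketch below the numerical values of E, m, and ω are illustrative only, and the helper names are not from the text.

```python
import math

def mass_squared(q):
    """Invariant q.q = q0^2 - |q|^2 for a 4-vector q = (q0, q1, q2, q3)."""
    return q[0] ** 2 - q[1] ** 2 - q[2] ** 2 - q[3] ** 2

def scattered_frequency(E, m, omega, theta):
    p = math.sqrt(E * E - m * m)
    return omega * (E + p) / (E + omega - (p - omega) * math.cos(theta))

E, m, omega = 5.0, 1.0, 0.2   # illustrative values with omega < p
p = math.sqrt(E * E - m * m)

worst = 0.0
frequencies = []
for k in range(0, 13):
    theta = k * math.pi / 12
    omega_out = scattered_frequency(E, m, omega, theta)
    frequencies.append(omega_out)
    p_in = (E, 0.0, 0.0, p)
    k_in = (omega, 0.0, 0.0, -omega)
    k_out = (omega_out, 0.0, omega_out * math.sin(theta), omega_out * math.cos(theta))
    p_out = tuple(a + b - c for a, b, c in zip(p_in, k_in, k_out))
    worst = max(worst, abs(mass_squared(p_out) - m * m))

print(worst, max(frequencies), frequencies[0])
```

The outgoing particle stays on shell for every angle, and the largest scattered frequency is the one at $\theta = 0$.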


2 The identity is $s + t + u = \sum_a m_a^2$. 


3 In calculating 4,,7"", we see that the first few steps are oF same as in calculating 0,,n" as given in the 


text. The reason is that, during these steps, the quantity in (7) is just going along for the ride. We 


arrive at 
a,7%" = = 23 / dT Mq qq d aoe da(Ta)) = — Mg = zo da(Ta)) = 
dt, d 
. es 
upon using the equation of motion —¢ = 0. 


4 T%x) = (0+ P) hy, TH(x) = (p + P) ERS + PSY 


5 The number of components is given by $\frac{1}{2}d(d - 1)$, which for d = 4 is equal to 6. First, $F^{0i}$ is a 3-vector. 
We are then left with the 3 components $F^{ij} = -F^{ji}$. Form the 3-vector $\epsilon^{ijk}F^{jk}$. 
IV.2 Electromagnetism Goes Live 


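4 We have 

$$\frac{\partial}{\partial t}\frac{1}{2}(\vec E^2 + \vec B^2) = \vec E \cdot \frac{\partial \vec E}{\partial t} + \vec B \cdot \frac{\partial \vec B}{\partial t} = -\vec E \cdot \vec J + \vec E \cdot \vec\nabla \times \vec B - \vec B \cdot \vec\nabla \times \vec E = -\vec E \cdot \vec J - \vec\nabla \cdot (\vec E \times \vec B)$$

As you may know from a course on electromagnetism, the vector $\vec E \times \vec B$ is known as the Poynting vector 
and measures the momentum flow in an electromagnetic field. Note that the equation we just derived is 
the $\nu = 0$ component of the equation $\partial_\mu T^{\mu\nu}_{\rm electromagnetic} = -J_\mu F^{\mu\nu}$ we derived in exercise 2b. The relativistic 
notation is far more compact! 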


5 Using the antisymmetry of the € symbol and €9173 = —1, we have Fy; = —€173F2? = B', after referring 
to (IV.1.17). Next, Fy) = —€993F? = E?. Comparing with (IV.1.17), we see that going from F,,, to the 
dual tensor F,,,,, we exchange the roles of the electric and the magnetic fields up to a sign: E > "8 and 


bv? 
BoE. 


7 Simply plug the identity in the preceding exercise into the expression for the energy momentum tensor 
of the electromagnetic field. 


8 The two invariants under Lorentz transformation with the stated property are $F_{\mu\nu}F^{\mu\nu}$ and $F_{\mu\nu}\tilde F^{\mu\nu}$. We 
have already encountered the first one in exercise 1. Up to an overall constant, the second invariant is 
equal to $\vec E \cdot \vec B$. 

10 As in the derivation of the virial theorem in classical mechanics, we want to take the time average of various 


quantities. Define $\langle A \rangle = \frac{1}{T}\int_0^T dt\, A$ for T large. Note that, provided that a time dependent quantity B(T) 
remains bounded, $\left\langle \frac{dB}{dt} \right\rangle = \frac{1}{T}(B(T) - B(0)) \to 0$ for T large. This is where the assumption that the motion 
of the particles is confined to a finite region comes in. 


Let T= Te 4 THY _, as in exercise 2b. Time averaging the conservation law 3,7’ = 
particles electromagnetic Bb 


oT) + 9,74 = 0, we obtain d,(T/) = 0. Therefore, f d?xx/9,(TV) =0=— f d3x(T"). In the last step, 
we integrated by parts. Here repeated indices are summed. 
Next, using the result of exercise 9 and the expression for T 


Titles n (III.6.7), we have 


Ts NyuvT i = Nw, particles = eS / dT, m oe — Yq (Tq)) 


=— Somali - 8, 8 - Gata) 


where f, is understood as the solution of ¢°(t,) = x° =. For the last equality, we used (III.6.11) and the 
discussion that follows it to integrate over t, 
Finally, putting things together, we obein 


- f ex) =Som (y1-8,)= [os (r®— 7") = f b(t) = 6 


where E denotes the total energy of the system. Thus, we obtain the relativistic virial theorem 


e=Dm(Vt- Re) 


In the nonrelativistic limit, we recover the usual virial theorem E — 7, mg =—} Dig Malvy_,). While 
the right hand side is equal to minus the time averaged kinetic energy (K), the left hand side is the total 
nonrelativistic energy, which we can write as the time averaged kinetic plus potential energy (K + V), 
since K + V is conserved. We thus recognize the more familiar nonrelativistic form of the virial theorem 


As the total energy decreases, the kinetic energy, that is, the temperature, increases. 


Prologue to Book Two: The Happiest Thought 


First of all, we have to understand how a burning candle works normally. The hot gas produced by the 
burning candle, being less dense than air, rushes upward. The upward rush of the glowing gas is what we 
see as the flame. The candle is thus assured of a steady supply of oxygen from the ambient air as the gas 
rushes out of the way. The second point is that the upward rush of the gas can be better interpreted as due 
to gravity pulling the denser air down. By moving downward, the ambient air is actually displacing the 
gas upward. The falling candle feels no gravity, and neither does the air around it. The hot gas expands 
outward rather than rushing upward out of the way. For a moment, the candle is deprived of air supply 
and goes out. Watch this on the web! (http://www.youtube.com/watch?v=NIBp21fqguU) 


V.1 Spacetime Becomes Curved 


The helium in the balloon does not know, momentarily, that the car has stopped and tries to continue 
its forward motion. But the air in the car is trying to do the same. When the air reaches the front part 
of the interior of the car, it flows back. Since the density of air is higher than the density of helium, it 
pushes the helium balloon back. Unlike other massive objects in the car, such as the driver and the 
passengers, the helium balloon jerks backward rather than forward. 


V.2 The Power of the Equivalence Principle 


Let R = earth radius, w = earth’s angular velocity, and h = altitude of plane. With dr = 0 and d6 = 0, the 
proper time interval is given by 


2 
dv= (1 ae) dt? —rd¢? = ( pis igrome (#)) dt? 
r r 


dt 
Since ay x Report we have 
M 1 
dt= (1 o = (Ro+v4 hoy?) dt 
R+h 2 


with the proper time interval dt, of the clock on the ground given by this expression with v and h set to 


0. The fractional shift is thus equal to 


dt — dt, 
gw seid Ro 12 
dt R2 2 


g 
We find that the fractional shift between the eastward flying clock and the westward clock is Agy = 


dte-dtw ~ 4,2 
Pleo a 


V.3 The Universe as a Curved Spacetime 


With $r = L\sin\psi$, for example, we have $L\,d\psi = \frac{dr}{\sqrt{1 - \frac{r^2}{L^2}}}$. 
6 


V.4 


V.6 


Do I have to teach you how to integrate? 


dr 


The trajectory of the light ray is determined by a = =, and thus the proper time interval Arp 


ks 


between the two pulses is now given by 


iE dr ‘B dt ea dt 
0 f1-ee Sts alt) Siszars a) 
The derivation then proceeds as in the text. We don’t care about the r integral, only the equality between 


the two t integrals. 


V.4 Motion in Curved Spacetime 


We obtain rv. = A + & +2 and’, cot6, and we verify that My = +d, g@ is satisfied, since 


i Ov Nia 
g=r‘AB sin? 0. 
With A = 1and 6, g constant, « = 1, r constant solves the radial equations of motion. 
A plot of v(r) shows that for r > 2GM, particles “fall” toward larger r. 


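Plugging $\tilde g_{\mu\nu}(x) = \Omega^2(x)g_{\mu\nu}(x)$ into the expression for the Christoffel symbol, we obtain 

$$\tilde\Gamma^\mu_{\nu\lambda} = \Gamma^\mu_{\nu\lambda} + (\delta^\mu_\nu\partial_\lambda + \delta^\mu_\lambda\partial_\nu - g_{\nu\lambda}g^{\mu\rho}\partial_\rho)\log\Omega$$

Thus, in the “twiddle” spacetime, the geodesic equation $\frac{d^2X^\mu}{d\tau^2} + \tilde\Gamma^\mu_{\nu\lambda}\frac{dX^\nu}{d\tau}\frac{dX^\lambda}{d\tau} = 0$ is manifestly not the 
same as the geodesic equation $\frac{d^2X^\mu}{d\tau^2} + \Gamma^\mu_{\nu\lambda}\frac{dX^\nu}{d\tau}\frac{dX^\lambda}{d\tau} = 0$ in the “nontwiddle” spacetime. 

However, for a massless particle, if $X^\mu$ describes a trajectory so that $g_{\mu\nu}(X)dX^\mu dX^\nu = 0$ according to 
(19), then clearly $\tilde g_{\mu\nu}(X)dX^\mu dX^\nu = \Omega^2(X)g_{\mu\nu}(X)dX^\mu dX^\nu = 0$. 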

It is also instructive to show that the geodesic equation (20) holds in both spacetimes. Suppose that it 
holds in the “twiddle” spacetime. Then we have 


PX" | ay dX" dX* _ d?XH |, dX” dX* | dX" dX” 


a a, log 2=0 
doz’ “de de dt | “de dt de de® 


where we used dX - dX = 0 for a massless particle. This does not look like the geodesic equation in the 
“nontwiddle” spacetime. 
But suppose we write ¢(7). Then we have 


dX" ~— dn dX" ana d*X" dnd (2) (2) _ @y dX" 
dé dt dy dt2 dt dn \de dyn) \de) dn* * de? dn 


Now note that the last term on the left hand side is equal to 


in 2 XE 
5dXx" d log 2 =2 (2) dX (2/2) 
dg dg dg} dyn \dn 


(Here Q is evaluated on the trajectory of the particle and hence may be regarded as either a function of 


¢ or n.) Thus, by choosing n(¢) to satisfy ot = zw , we can knock off this unwanted last term and obtain 
a?xt M dX” dX* _ 
dn* + Pa dn dn =0. 


V.6 Covariant Differentiation 


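Varying the action 

$$S = \int d^4x \sqrt{-g}\left(-\frac{1}{4}g^{\mu\rho}g^{\nu\sigma}F_{\mu\nu}F_{\rho\sigma} + A_\mu J^\mu\right)$$

we obtain 

$$\partial_\mu(\sqrt{-g}\,g^{\mu\rho}g^{\nu\sigma}F_{\rho\sigma}) = -\sqrt{-g}\,J^\nu$$

that is, 

$$D_\mu F^{\mu\nu} = \frac{1}{\sqrt{-g}}\partial_\mu(\sqrt{-g}\,F^{\mu\nu}) = -J^\nu$$

Compare with (IV.2.13). 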


Note that in curved spacetime, Fo, and F,, are no longer equal to —F°! and F", respectively. Our 
convention is to define E = (E!, E”, E+) and B = (B!, B?, B>) by E! = — Fo, and B} = Fy) and their 
cyclic analogs. For ds* = dt? — a(t)*dx*, —g =a®, g© =1, g!!=1/a?, andsoS = 3 f dtd?x(a(t) E2 — 


1 p2 
aw B’)- 


VI.1 To Einstein’s Field Equation, as Quickly as Possible 


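Simply plug in the Christoffel symbols for the sphere $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \frac{\cos\theta}{\sin\theta}$ into the 
expressions $D_\mu W_\nu = \partial_\mu W_\nu - \Gamma^\lambda_{\mu\nu}W_\lambda$ and $D_\mu U^\nu = \partial_\mu U^\nu + \Gamma^\nu_{\mu\lambda}U^\lambda$ to obtain 

$$D_\theta W_\theta = \partial_\theta W_\theta, \qquad D_\theta W_\varphi = \partial_\theta W_\varphi - \frac{\cos\theta}{\sin\theta}W_\varphi,$$

$$D_\varphi W_\theta = \partial_\varphi W_\theta - \frac{\cos\theta}{\sin\theta}W_\varphi, \qquad D_\varphi W_\varphi = \partial_\varphi W_\varphi + \sin\theta\cos\theta\, W_\theta$$

and 

$$D_\theta U^\theta = \partial_\theta U^\theta, \qquad D_\theta U^\varphi = \partial_\theta U^\varphi + \frac{\cos\theta}{\sin\theta}U^\varphi,$$

$$D_\varphi U^\theta = \partial_\varphi U^\theta - \sin\theta\cos\theta\, U^\varphi, \qquad D_\varphi U^\varphi = \partial_\varphi U^\varphi + \frac{\cos\theta}{\sin\theta}U^\theta$$

The suggested check is then easily done. 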


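In calculating the left hand side of (5), $[D_\mu, D_\nu]S_\rho$, we could choose $\rho$ to be either $\theta$ or $\varphi$. I will do one
case and let you do the other. We have

$$D_\varphi D_\theta S_\theta = \partial_\varphi(D_\theta S_\theta) - \frac{\cos\theta}{\sin\theta}D_\varphi S_\theta - \frac{\cos\theta}{\sin\theta}D_\theta S_\varphi$$
$$= \partial_\varphi\partial_\theta S_\theta - \frac{\cos\theta}{\sin\theta}\partial_\varphi S_\theta + \left(\frac{\cos\theta}{\sin\theta}\right)^2S_\varphi - \frac{\cos\theta}{\sin\theta}\partial_\theta S_\varphi + \left(\frac{\cos\theta}{\sin\theta}\right)^2S_\varphi$$
and
$$D_\theta D_\varphi S_\theta = \partial_\theta(D_\varphi S_\theta) - \frac{\cos\theta}{\sin\theta}D_\varphi S_\theta$$
$$= \partial_\theta\left(\partial_\varphi S_\theta - \frac{\cos\theta}{\sin\theta}S_\varphi\right) - \frac{\cos\theta}{\sin\theta}\partial_\varphi S_\theta + \left(\frac{\cos\theta}{\sin\theta}\right)^2S_\varphi$$

Subtracting, we find $[D_\varphi, D_\theta]S_\theta = -S_\varphi$. Equating this to the right hand side of (5) and invoking the symmetry property
of the curvature tensor and the diagonality of the metric, we obtain $R^\varphi{}_{\theta\varphi\theta} = 1$. From this $R_{\theta\theta} = 1$ follows.

We also have $R^\theta{}_{\varphi\theta\varphi} = g^{\theta\theta}R_{\theta\varphi\theta\varphi} = g^{\theta\theta}g_{\varphi\varphi}R^\varphi{}_{\theta\varphi\theta}$, and so $R^\theta{}_{\varphi\theta\varphi} = \sin^2\theta$. Hence we have $R_{\varphi\varphi} = \sin^2\theta$.

Finally, we obtain $R = g^{\theta\theta}R_{\theta\theta} + g^{\varphi\varphi}R_{\varphi\varphi} = 2$, as expected.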
VI.2


In 2 dimensions, (14) simplifies to the expression stated. If we are so fortunate that we are already in locally 
flat coordinates, we can read off the single component of the Riemann curvature tensor. Let us check the 
example in (1.6.2): we read off By1,9) =, Byz,12 = 3 (ab +c”), By 4, =’, and so R491) = ab — c’, in 
complete agreement with (1.6.3). 


The first one is flat (the transformation from Cartesian coordinates is x =u +02, y=v—u?*/2). The 
second has a scalar curvature given by —4v/(1+ 4uv — 2v? + 2u*v*)?. 


The antisymmetry in (10) and (15) implies that the indices $A$ and $B$ in the Petrov notation can each
take on $\frac12d(d-1)$ values. Next, (16) implies that the matrix $R_{AB}$ is symmetric, and thus contains
$\frac12\{\frac12d(d-1)\}\{\frac12d(d-1) + 1\} = d(d-1)(d^2-d+2)/8$ independent components. Finally, after imposing
the constraints from exercise 3, we are left with $d(d-1)(d^2-d+2)/8 - d(d-1)(d-2)(d-3)/24 =
d^2(d^2-1)/12$, in agreement with chapter I.6.


Equating the number of independent components in the Riemann curvature tensor $d^2(d^2-1)/12$ to
the number of independent components in the Ricci curvature tensor $d(d+1)/2$, we obtain the cubic
$d^3 - 7d - 6 = (d-3)(d+1)(d+2) = 0$. Thus, for $d = 3$, the two tensors have the same number of
components.
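The two counting arguments above are easy to confirm by brute arithmetic. The following is only a minimal sketch; the function names are my own and are not taken from the text.

```python
from math import comb

def riemann_components(d):
    """Independent components of the Riemann tensor in d dimensions."""
    n_pairs = d * (d - 1) // 2                 # values of an antisymmetric index pair A
    symmetric = n_pairs * (n_pairs + 1) // 2   # symmetric matrix R_AB
    cyclic = comb(d, 4)                        # d(d-1)(d-2)(d-3)/24 constraints
    return symmetric - cyclic

def ricci_components(d):
    """Independent components of the symmetric Ricci tensor."""
    return d * (d + 1) // 2

for d in range(1, 7):
    # the closed form d^2 (d^2 - 1) / 12 quoted in the solution
    assert riemann_components(d) == d * d * (d * d - 1) // 12
    print(d, riemann_components(d), ricci_components(d))
```

The two counts coincide only at $d = 3$, the positive root of the cubic.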


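Plugging $\tilde g_{\mu\nu} = \Omega^2g_{\mu\nu}$ into the definition of the Christoffel symbol, we obtain the first equality in (26)
immediately as a result of the product rule of differentiation. Writing it schematically as $\tilde\Gamma \sim \Gamma + \Omega^{-1}\partial\Omega$
and plugging it into the definition $R^{\cdot}{}_{\cdots} \sim \partial\Gamma + \Gamma\Gamma$, we have $\tilde R^{\cdot}{}_{\cdots} \sim \partial\Gamma + \Gamma\Gamma + \Omega^{-1}\partial\partial\Omega + \Gamma\Omega^{-1}\partial\Omega +
\Omega^{-1}\partial\Omega\,\Omega^{-1}\partial\Omega$. Convincing ourselves that the third and fourth terms combine into $\Omega^{-1}D\partial\Omega$ (as they
must and as we can readily verify by keeping track of the indices for one specific combination), we obtain
schematically $\tilde R^{\cdot}{}_{\cdots} \sim R^{\cdot}{}_{\cdots} + \Omega^{-1}D\partial\Omega + \Omega^{-2}\partial\Omega\partial\Omega$. Once we realize this, we can simplify the rest of the
calculation drastically by letting $g_{\mu\nu}$ be the flat metric (that is, $\eta_{\mu\nu}$ for spacetime or $\delta_{\mu\nu}$ for space), so
that the problem reduces to that of calculating the Riemann curvature tensor for the metric $\tilde g_{\mu\nu} = \Omega^2\eta_{\mu\nu}$ and
collecting the two sets of terms of the forms $\partial\partial\Omega$ and $\partial\Omega\partial\Omega$. Once we have that result, we can then
promote $\eta_{\mu\nu}$ to $g_{\mu\nu}$, and so on, to obtain $\tilde R_{\mu\nu\lambda\sigma}$.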


To Cosmology as Quickly as Possible 


Calculate the Christoffel symbol, then the Ricci tensor. For example,

$$\Gamma^1_{01} = p/t, \qquad R_{00} = (-p^2 + p - q^2 + q - r^2 + r)/t^2, \qquad\text{and}\qquad R_{11} = p(p + q + r - 1)\,t^{2(p-1)}$$

(By the way, we will calculate the curvature for the Kasner universe using differential forms in chapter IX.8.)
Einstein’s equations $R_{\mu\nu} = 0$ are solved for $p + q + r = p^2 + q^2 + r^2 = 1$. These 2 equations
could of course be immediately solved by eliminating $q$ and $r$ in terms of $p$, giving a 1-parameter family
of universes.

There exists, however, a more elegant and symmetrical solution based on the identity $e^{i\pi/3} + e^{i\pi} +
e^{-i\pi/3} = 0$. Write $p = a + 2b\cos(\theta + (\pi/3)) = a + b(e^{i\theta}e^{i\pi/3} + e^{-i\theta}e^{-i\pi/3})$, $q = a + 2b\cos(\theta - (\pi/3))$, $r =
a + 2b\cos(\theta - \pi)$. Then $p + q + r = 1$ if $a = 1/3$. Next, using the identity $e^{2i\pi/3} + e^{-2i\pi/3} + e^{2i\pi} = 0$, we
have $p^2 + q^2 + r^2 = 3a^2 + 2ab(0) + 3(2b^2) = 1$ if $b = 1/3$.

Interestingly, the solution has the following geometrical interpretation. Draw a circle of radius 2/3
centered at $(x, y) = (1/3, 0)$. Inscribe an equilateral triangle inside the circle and oriented at some suitable
angle. The projections of the 3 vertices on the $x$-axis give $p$, $q$, $r$.

A more obvious geometrical construction would be to go to 3-dimensional Euclidean space and label
the axes as $p$, $q$, $r$. Then $p + q + r = p^2 + q^2 + r^2 = 1$ describes the circle formed by intersecting a unit
sphere centered at the origin by the plane passing through (1, 0, 0), (0, 1, 0), and (0, 0, 1).
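A quick numerical check of the trigonometric parametrization may be reassuring. This is a sketch only; the function name `kasner_exponents` is hypothetical, and the values $a = b = 1/3$ are the ones found above.

```python
import math

def kasner_exponents(theta, a=1/3, b=1/3):
    """Exponents (p, q, r) from three angles spaced by 2*pi/3 around a circle."""
    p = a + 2 * b * math.cos(theta + math.pi / 3)
    q = a + 2 * b * math.cos(theta - math.pi / 3)
    r = a + 2 * b * math.cos(theta - math.pi)
    return p, q, r

for theta in (0.0, 0.4, 1.3, 2.9):
    p, q, r = kasner_exponents(theta)
    # both the linear and the quadratic constraint should equal 1
    print(theta, p + q + r, p * p + q * q + r * r)
```

Each value of $\theta$ picks out one point on the circle in $(p, q, r)$ space described above.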
VI.3 The Schwarzschild-Droste Metric and Solar System Tests
of Einstein Gravity 


2 For example, $R^t{}_{rtr} = -\frac{A''(r)}{2A(r)} + \frac{A'(r)B'(r)}{4A(r)B(r)} + \frac{A'(r)^2}{4A(r)^2}$.
3 Use the transformation $r = \rho\left(1 + \frac{GM}{2\rho}\right)^2$. The horizon occurs at $\rho = \frac12GM$, where $g_{00}$ vanishes, and which
translates into $r = 2GM$ as expected.


4 Use the transformation R =r — GM with x related to (R, 6, 9) by the usual Cartesian to spherical 
coordinate transformation. 


5 Repeat the calculation in the text using the metric used in the post-Newtonian parametrization. All the 
steps are conceptually the same, but arithmetically, such parameters as $\beta$ and $\gamma$ appear here and there.


VI.4 Energy Momentum Distribution Tells Spacetime How to Curve


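4 Using (2), we obtain
$$\delta S_{\rm scalar} = \frac12\int d^4x\,\sqrt{-g}\left\{g^{\mu\rho}\delta g_{\rho\sigma}g^{\sigma\nu}\partial_\mu\varphi\partial_\nu\varphi - g^{\mu\nu}\delta g_{\mu\nu}\left(\frac12(\partial\varphi)^2 + V(\varphi)\right)\right\}$$

and hence the stated result. We have in flat spacetime $T^{00} = \partial^0\varphi\partial^0\varphi + \frac12(\partial\varphi)^2 + V(\varphi) = \frac12\{(\partial^0\varphi)^2 +
(\vec\nabla\varphi)^2\} + V(\varphi)$.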


5 We have 
/ d4x./—g(x)g™ (x)g*? (x) = / d*x' Je! (x')9'F x’) 9? (x') 
= / d*x/—g'(x)g'?§ (x)? (x) 


where the first equality follows from invariance under coordinate transformation and the second from
renaming the dummy integration variable. Using the leading order expression $g'_{\mu\nu}(x) - g_{\mu\nu}(x) =
-(g_{\mu\lambda}(x)\partial_\nu\varepsilon^\lambda(x) + g_{\lambda\nu}(x)\partial_\mu\varepsilon^\lambda(x) + \varepsilon^\lambda\partial_\lambda g_{\mu\nu}(x)) + O(\varepsilon^2)$ in (18), we arrive at (20), with $T^{\mu\nu}$ the energy
momentum tensor associated with the cosmological constant term.


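7 Recall that

$$D_\mu T^{\mu\nu} = \partial_\mu T^{\mu\nu} + \Gamma^\mu_{\mu\lambda}T^{\lambda\nu} + \Gamma^\nu_{\mu\lambda}T^{\mu\lambda} = \frac{1}{\sqrt{-g}}\partial_\mu\left(\sqrt{-g}\,T^{\mu\nu}\right) + \Gamma^\nu_{\mu\lambda}T^{\mu\lambda}$$

Plugging in the given form of $T^{\mu\nu}$ and suppressing an overall factor of $\frac{m}{\sqrt{-g(x)}}$ for the moment, we obtain
for the first term on the right hand side

$$\int d\tau\,\frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau}\,\partial_\mu\delta^4(x - X(\tau)) = -\int d\tau\,\frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau}\frac{\partial}{\partial X^\mu}\delta^4(x - X(\tau))$$
$$= -\int d\tau\,\frac{dX^\nu}{d\tau}\frac{d}{d\tau}\delta^4(x - X(\tau)) = \int d\tau\,\frac{d^2X^\nu}{d\tau^2}\,\delta^4(x - X(\tau))$$
where we integrated by parts in the last step. Putting it together, we find that $D_\mu T^{\mu\nu} = 0$ gives $\frac{d^2X^\nu}{d\tau^2} +
\Gamma^\nu_{\mu\lambda}\frac{dX^\mu}{d\tau}\frac{dX^\lambda}{d\tau} = 0$, which is just the geodesic equation of motion. The result is hardly surprising: the energy
momentum tensor we were given did not drop from the sky but was derived from the action for the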




VI.5 


VI.6


particle, while the equation of motion follows from the very same action. Physically, we expect the energy 
momentum tensor to be conserved only if the particle does what it is supposed to do, rather than moving 
around capriciously. 


Start with $0 = D_\mu T^{\mu\nu} = D_\mu(\rho U^\mu)U^\nu + \rho U^\mu D_\mu U^\nu$. Contract with $U_\nu$ and use $U_\nu D_\mu U^\nu = 0$ (since $U_\nu U^\nu =
-1$). We obtain $D_\mu(\rho U^\mu) = 0$. Plugging this back into the above, we find the stated result $U^\mu D_\mu U^\nu = 0$,
which tells us that the dust particles follow geodesics, as might be expected. See also exercise 7.


Gravity Goes Live 


According to exercise VI.1.6, in 2-dimensional spacetime $R_{\tau\rho\mu\nu} = \frac12R(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})$ (this follows
immediately since the Riemann curvature tensor has only one component). Contracting, we find that
$R_{\rho\nu} = g_{\rho\nu}R/2$ and thus $E_{\mu\nu} = 0$.


Initial Value Problems and Numerical Relativity 


To obtain $E^{0\nu}$, we have to calculate the Riemann curvature tensor. From $R^{\cdot}{}_{\cdots} \sim \partial\Gamma + \Gamma\Gamma$, we obtain
$R_{\cdots\cdot} \sim g\partial\Gamma + g\Gamma\Gamma$. Since $\Gamma \sim g^{\cdot\cdot}\partial g_{\cdot\cdot}$, we encounter in $\partial\Gamma$ terms involving $\partial\partial g_{\cdot\cdot}$ and terms involving
$\partial g^{\cdot\cdot} \sim g^{\cdot\cdot}\partial g_{\cdot\cdot}g^{\cdot\cdot}$ (which, if you want, you could express in terms of $\Gamma$ and $g$ using the definition of $\Gamma$, but
you don’t even have to bother for the purposes at hand). Thus, $R_{\cdots\cdot} \sim \partial\partial g_{\cdot\cdot} + \Gamma\Gamma$.

Since we are hunting for $\partial_0^2$, we could care less about the $\Gamma\Gamma$ terms. So far so good, but it would appear
that we still have to slave away to obtain the $\partial\partial g_{\cdot\cdot}$ terms. Now we appeal to Professor Flat for help. Go to
a locally flat coordinate system. Back in chapter VI.1, we obtained $R_{\cdots\cdot}$ locally in terms of the $\partial\partial g_{\cdot\cdot}$. The
general expression for $R_{\cdots\cdot}$ must reduce to the expression in chapter VI.1; hence we conclude that

$$R_{\tau\rho\mu\nu} = \frac12\left(g_{\tau\nu,\mu\rho} - g_{\rho\nu,\mu\tau} - g_{\tau\mu,\nu\rho} + g_{\rho\mu,\nu\tau}\right) + \Gamma\Gamma\text{ terms}$$

where we have switched to the comma notation for partial derivatives: $g_{\tau\nu,\mu\rho} = \partial_\mu\partial_\rho g_{\tau\nu}$. Keeping in mind
the antisymmetric properties of $R_{\tau\rho\mu\nu}$ we see that we won’t encounter $g_{00,00}$ and $g_{i0,00}$. We are down
to $R_{i0j0} = \frac12(g_{i0,j0} - g_{00,ji} - g_{ij,00} + g_{0j,0i}) + \Gamma\Gamma$ terms, leading to $R_{i0j0}$ “=” $-\frac12g_{ij,00}$. To save writing,
we introduce the symbol “=” to mean equal up to terms not containing $\partial_0^2$.

Contracting indices, we find

$$R_{00}\ \text{“=”}\ -\frac12g^{ij}g_{ij,00}, \qquad R_{0i}\ \text{“=”}\ +\frac12g^{0j}g_{ij,00}, \qquad R_{ij}\ \text{“=”}\ -\frac12g^{00}g_{ij,00}$$
and $R$ “=” $(g^{0i}g^{0j} - g^{00}g^{ij})g_{ij,00}$.
Confusio is busily calculating in the corner. That guy does every exercise in the book to set a good
example for the students. Now he cries out, “But I get $E_{00} = R_{00} - \frac12g_{00}R$ “=” $-\frac12(g^{ij} + g_{00}g^{0i}g^{0j} -
g_{00}g^{00}g^{ij})g_{ij,00}$ and this contains $\partial_0^2$!”

Perhaps you could help Confusio out. You chide him, “By now, you should know the importance of
distinguishing upper and lower indices! In the text, the statement is that $E^{0\nu}$ does not contain $\partial_0^2$.”

We will raise the indices in two steps. First,

$$R^0{}_0 = g^{00}R_{00} + g^{0i}R_{i0}\ \text{“=”}\ \frac12\left(g^{0i}g^{0j} - g^{00}g^{ij}\right)g_{ij,00}$$
Indeed,
$$E^0{}_0 = R^0{}_0 - \frac12R\ \text{“=”}\ 0$$

Similarly,

$$E^0{}_i = R^0{}_i = g^{00}R_{0i} + g^{0j}R_{ji}\ \text{“=”}\ 0$$

Now we are almost done:
$$E^{00} = g^{00}E^0{}_0 + g^{0i}E^0{}_i\ \text{“=”}\ 0 \qquad\text{and}\qquad E^{0i} = g^{i0}E^0{}_0 + g^{ij}E^0{}_j\ \text{“=”}\ 0$$

Phew! It’s nice to have the Bianchi identity proof!
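The cancellation of the second time derivatives in $E^{0\nu}$ can also be checked numerically. The sketch below is my own construction and not part of the text: it assumes only the three “=” expressions for $R_{00}$, $R_{0i}$, $R_{ij}$ quoted in this solution, treats $g^{\mu\nu}$ and $g_{ij,00}$ as arbitrary symmetric arrays of random numbers, and raises both indices directly.

```python
import random

def einstein_00_terms(ginv, h):
    """Return E^{mu nu}, keeping only terms proportional to h_ij = g_{ij,00}.

    ginv: 4x4 symmetric list of lists, the inverse metric g^{mu nu}
    h:    3x3 symmetric list of lists, standing for g_{ij,00}
    """
    idx = range(1, 4)  # spatial indices
    ricci = [[0.0] * 4 for _ in range(4)]
    ricci[0][0] = -0.5 * sum(ginv[i][j] * h[i - 1][j - 1] for i in idx for j in idx)
    for i in idx:
        ricci[0][i] = ricci[i][0] = 0.5 * sum(ginv[0][j] * h[i - 1][j - 1] for j in idx)
        for j in idx:
            ricci[i][j] = -0.5 * ginv[0][0] * h[i - 1][j - 1]
    scalar = sum(ginv[m][n] * ricci[m][n] for m in range(4) for n in range(4))
    # E^{mu nu} = g^{mu a} g^{nu b} R_ab - (1/2) g^{mu nu} R
    return [[sum(ginv[m][a] * ginv[n][b] * ricci[a][b]
                 for a in range(4) for b in range(4)) - 0.5 * ginv[m][n] * scalar
             for n in range(4)] for m in range(4)]

def random_symmetric(size, rng):
    """A symmetric matrix with entries drawn uniformly from [-1, 1]."""
    m = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i, size):
            m[i][j] = m[j][i] = rng.uniform(-1.0, 1.0)
    return m

rng = random.Random(0)
E = einstein_00_terms(random_symmetric(4, rng), random_symmetric(3, rng))
print("largest |E^{0 nu}|:", max(abs(E[0][n]) for n in range(4)))
print("largest |E^{ij}|:  ", max(abs(E[i][j]) for i in range(1, 4) for j in range(1, 4)))
```

The first number should vanish to rounding error for any input, while the purely spatial components generally do not.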
2 Using the contracted Bianchi identity as in the text, we have $\partial_0(E^{0\nu} - T^{0\nu}) = -\partial_iE^{i\nu}$ + terms involving
the Christoffel symbols ($\Gamma$ times various $E$s) minus $\partial_0T^{0\nu}$. Next, use the field equation $E^{\mu\nu} = T^{\mu\nu}$ to
write this as $-\partial_iT^{i\nu} + \Gamma T - \partial_0T^{0\nu}$. If there is any justice in the world, the $\Gamma$s should convert the $\partial$s into
covariant derivatives and turn this into $-D_\mu T^{\mu\nu} = 0$.


VII.1 Particles and Light around a Black Hole


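3 Following the by now familiar steps, we obtain for the radially plunging observer the equations of motion
$$\frac{dT}{d\tau} - v\left(\frac{dr}{d\tau} + v\frac{dT}{d\tau}\right) = 1 \qquad\text{and}\qquad \left(\frac{dT}{d\tau}\right)^2 - \left(\frac{dr}{d\tau} + v\frac{dT}{d\tau}\right)^2 = 1$$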
Solving, we obtain $\frac{dr}{d\tau} = -v = -\sqrt{r_S/r}$ and $\frac{dT}{d\tau} = 1$. We also see that if we set $dr = -vdT$ in (13), we
obtain $d\tau = dT$.


4 The path of a radially infalling photon is determined by $ds = 0$, which implies $dT = \pm(dr + vdT)$, with
$v = \sqrt{r_S/r}$. We thus obtain $\frac{dr}{dT} = -(1 + v)$ for an infalling photon (and $\frac{dr}{dT} = (1 - v)$ for an outgoing
photon). This proves that a photon falls faster than the plunging observer. Note that in these coordinates,
we have, for an outgoing photon, $\frac{dr}{dT} = 0$ at the horizon, as expected.


VII.3 Hawking Radiation


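1 From chapters V.4 and VI.3, we have

$$\left(\frac{dr}{d\tau}\right)^2 = -\left(1 - \frac{r_S}{r}\right) + \epsilon^2 \qquad\text{with}\qquad \epsilon^2 = 1 - \frac{r_S}{r_S + a}$$

Integrating, we obtain

$$\Delta\tau = \int_{r_S}^{r_S+a}dr\left(\frac{r_S}{r} - \frac{r_S}{r_S + a}\right)^{-\frac12} \simeq 2(r_Sa)^{\frac12}$$

The time it takes to reach the horizon scales like $a^{\frac12}$. So Heisenberg tells us that the characteristic energy
is $\sim \hbar/2(r_Sa)^{\frac12}$. Multiplying by the gravitational redshift factor

$$(-g_{00}(r_S + a))^{\frac12} = \left(1 - \frac{r_S}{r_S + a}\right)^{\frac12} \simeq \sqrt{a/r_S}$$

derived in chapter V.4, we obtain that the characteristic energy measured at spatial infinity is given by
$T_H \sim \hbar/r_S \sim \hbar/GM$. Nicely, the dependence on $a$ cancels out.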
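The leading-order estimate $\Delta\tau \simeq 2(r_Sa)^{1/2}$ can be compared with a direct numerical integration. This is a minimal sketch under my own assumptions: the function name `fall_time` is hypothetical, and the substitution $r = r_S + a - u^2$ is introduced here only to remove the integrable singularity at the starting point.

```python
import math

def fall_time(r_s, a, steps=2000):
    """Proper time to fall from rest at r_s + a down to r_s.

    Integrates dr / sqrt(r_s/r - r_s/r0) with r0 = r_s + a, using r = r0 - u**2,
    for which the integrand becomes the smooth function 2*sqrt(r*r0/r_s) du.
    """
    r0 = r_s + a
    du = math.sqrt(a) / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * du          # midpoint rule
        r = r0 - u * u
        total += 2.0 * math.sqrt(r * r0 / r_s) * du
    return total

r_s = 1.0
for a in (1e-1, 1e-2, 1e-4):
    # ratio of the numerical proper time to the estimate 2 sqrt(r_s a)
    print(a, fall_time(r_s, a) / (2.0 * math.sqrt(r_s * a)))
```

The ratio approaches 1 as $a/r_S \to 0$, which is the regime in which the estimate is used.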


VII.5 Rotating Black Holes 


1 We have dr = fd? + fgd0, where f; = 0f/d7, and so forth. Getting rid of the cross term d7d0 requires 
fg = —c/a, which fixes f up to an arbitrary additive function of 7. We then drop the tilde sign. 


2 We obtain 
2 2 
0 1 
Pads fies cae -o(4) 


arg sin? 6 arg sin? 6 cos? 6 “O 1 
819 ' 3 ! r> 


r F: 
2 2 gia
2.2 _@ , arsgsin’é | 1 
Soy > 1° sin o(u a 3 + o(3 


rs a* cos? 6 — a? + re ars cos? 6 — 2a*rg + re 1 
8rr > (14 t t 3 + O oe 


r r2 


VIII.1 The Dynamic Universe


1 Use the fact that $\rho = \rho_0(R_0/R)^4$ to write $\dot R^2 = \frac{T^2}{4R^2} - 1$, where we define the time scale $T$ by $T^2 =
(4/3)(8\pi G\rho_0R_0^4)$. Then the solution is $R(t) = \sqrt{t(T - t)}$. For small $t$, $R(t) \propto t^{\frac12}$, in agreement with what
we had in the text. The universe ends at time $T$.


3 In this case, $\rho = \rho_0(R_0/R)^3$. Define $T = 4\pi G\rho_0R_0^3/3$, so that the cosmological equation becomes $\dot R^2 =
\frac{2T}{R} - 1$. The solution is given parametrically by $R(\eta) = T(1 - \cos\eta)$ and $t(\eta) = T(\eta - \sin\eta)$. For small
$\eta$, $R \simeq T\eta^2/2$ and $t(\eta) \simeq T\eta^3/6$, so that in the early universe, $R(t) \propto t^{\frac23}$.
Amusingly, the resulting curve $R(t)$ is a cycloid, namely the curve traced by a point on the rim of a
rolling wheel, with $R(\eta)$ corresponding to the height of the point and $t(\eta)$ to the distance traveled by the
wheel.
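Both closed-universe solutions can be checked against their cosmological equations numerically. In this sketch the function names are my own; the radiation case uses a central finite difference for $\dot R$, and the matter case uses $\dot R = (dR/d\eta)/(dt/d\eta)$.

```python
import math

def radiation_residual(t, T, h=1e-6):
    """Rdot^2 - (T^2/(4 R^2) - 1) for R(t) = sqrt(t (T - t))."""
    R = lambda s: math.sqrt(s * (T - s))
    r_dot = (R(t + h) - R(t - h)) / (2.0 * h)   # central difference
    return r_dot ** 2 - (T ** 2 / (4.0 * R(t) ** 2) - 1.0)

def matter_residual(eta, T):
    """Rdot^2 - (2T/R - 1) for the cycloid R = T(1 - cos eta), t = T(eta - sin eta)."""
    R = T * (1.0 - math.cos(eta))
    r_dot = math.sin(eta) / (1.0 - math.cos(eta))   # (dR/d eta) / (dt/d eta)
    return r_dot ** 2 - (2.0 * T / R - 1.0)

T = 2.0
for t in (0.2, 0.5, 1.0, 1.7):
    print("radiation", t, radiation_residual(t, T))
for eta in (0.3, 1.0, 2.5, 4.0):
    print("matter   ", eta, matter_residual(eta, T))
```

All residuals should vanish up to the finite-difference and rounding errors.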


VIII.2 Cosmic Struggle between Dark Matter and Dark Energy


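1 Set $\Omega_{m,0} = 1$ and $\Omega_{r,0} = \Omega_{\Lambda,0} = 0$ in (32) to find $t_{\rm age} = H_0^{-1}\int_0^1da\,a^{\frac12}$. From $a \propto t^{\frac23}$, we have $H = \frac{2}{3t}$.
Plugging into (7), we have $\rho = \left(\frac{3}{8\pi G}\right)\left(\frac{4}{9t^2}\right)$ and thus the stated result.


2 Set $\Omega_{r,0} = 1$ and $\Omega_{m,0} = \Omega_{\Lambda,0} = 0$ in (32) to find $t_{\rm age} = H_0^{-1}\int_0^1da\,a$. From $a \propto t^{\frac12}$ we have $H = \frac{1}{2t}$.
Plugging into (7), we have $\rho = \left(\frac{3}{8\pi G}\right)\left(\frac{1}{4t^2}\right)$ and thus the stated result.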


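3 Setting $\ddot R = 0$ in (VIII.1.16), we obtain the condition $\rho + 3P = 0 = \rho_m - 2\rho_\Lambda$, and thus $\rho = \rho_m + \rho_\Lambda =
3\rho_\Lambda = 3\Lambda$. Setting $\dot R = 0$ in (VIII.1.18), we obtain $k = 1$ necessarily and the radius $R = \left(\frac{3}{8\pi G\rho}\right)^{\frac12} =
\left(\frac{1}{8\pi G\Lambda}\right)^{\frac12}$.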
4 From Q, = —a3—7, we have Q(z) = Q, (1 + 2)*(Ho/H (z))?. Using (15) and a = (1+ z)~}, we obtain 
the stated result. 


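5 According to (32), we have $\int dt = (H_0)^{-1}\int da\,(\Omega_{m,0}a^{-1} - |\Omega_{\Lambda,0}|a^2)^{-\frac12}$, and thus the expansion of the
universe stops when the denominator in the integrand, which is in fact proportional to $H$, vanishes. For
$a$ larger than $a_{\max} = (\Omega_{m,0}/(-\Omega_{\Lambda,0}))^{\frac13}$ the expression under the square root goes negative. The time necessary to go from
$a_{\max}$ to the Big Crunch is thus given by

$$(H_0)^{-1}\int_0^{a_{\max}}da\,(\Omega_{m,0}a^{-1} - |\Omega_{\Lambda,0}|a^2)^{-\frac12} = \frac{2}{3H_0|\Omega_{\Lambda,0}|^{\frac12}}\int_0^1du\,(1 - u^2)^{-\frac12} = \frac{\pi}{3H_0|\Omega_{\Lambda,0}|^{\frac12}}$$

where $u = (a/a_{\max})^{\frac32}$. The total life of the universe, as measured from the Big Bang to the Big Crunch, is thus $T_{\rm universe} =
\frac{2\pi}{3H_0|\Omega_{\Lambda,0}|^{\frac12}}$.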

IX.1 Parallel Transport


Let the rectangle have sides of length $a$ and $b$, with its “lower left” corner at $(x^*, y^*)$ and its “upper right”
corner at $(x^* + a, y^* + b)$. We arbitrarily pick the counterclockwise direction to integrate in. Then $a^{xy} =
\int_{x^*}^{x^*+a}dx\,(y^* - (y^* + b)) = -ab$. As a check, $a^{yx} = \int_{y^*}^{y^*+b}dy\,(x^* + a - x^*) = ab$.


Denote the vertices of the triangle by $(A, B, C)$ and the corresponding interior angles by $(a, b, c)$. The
sides of the triangle are straight lines, of course, namely geodesics. Along a geodesic, the tangent vector
is parallel transported, and so, along each side of the triangle, the angle between the vector we are parallel
transporting, call it $S$, and the tangent vector remains constant. Let’s say we are moving along the side
$CA$. When the tangent vector reaches the vertex $A$, it turns through an angle of $(\pi - a)$ to point along the
side $AB$. Thus, when we get back to where we started, what we call the tangent vector has turned through
an angle of $(\pi - a) + (\pi - b) + (\pi - c) = 3\pi - (a + b + c)$, which is the same as $\pi - (a + b + c)$ up to a full turn of $2\pi$. Since the angle between
$S$ and the tangent vector remains the same, $S$ has also turned through $\pi - (a + b + c)$. Thus, the angular excess $(a + b + c) - \pi$
measures the curvature.


Geodesic Deviation 


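Plug $y^\mu(\tau) = x^\mu(\tau) + \epsilon^\mu(\tau)$ into (IX.3.2), expand in $\epsilon$, and subtract (IX.3.1) to obtain

$$\frac{d^2\epsilon^\mu}{d\tau^2} + \epsilon^\rho\partial_\rho\Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} + 2\Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{d\epsilon^\lambda}{d\tau} = 0 \tag{33}$$
Note that $\frac{d\epsilon^\mu}{d\tau}$ is not a vector; indeed, none of the terms in (33) is a vector.
We would like to rewrite this as an equation between vectors, and so let’s evaluate
$$\frac{D^2\epsilon^\mu}{D\tau^2} = \frac{d}{d\tau}\left(\frac{d\epsilon^\mu}{d\tau} + \Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\epsilon^\lambda\right) + \Gamma^\mu_{\sigma\rho}\frac{dx^\sigma}{d\tau}\left(\frac{d\epsilon^\rho}{d\tau} + \Gamma^\rho_{\nu\lambda}\frac{dx^\nu}{d\tau}\epsilon^\lambda\right) \tag{34}$$
After the differentiation is carried out, the first two terms in (34) become
$$\frac{d^2\epsilon^\mu}{d\tau^2} + \left(\partial_\rho\Gamma^\mu_{\nu\lambda}\right)\frac{dx^\rho}{d\tau}\frac{dx^\nu}{d\tau}\epsilon^\lambda + \Gamma^\mu_{\nu\lambda}\frac{d^2x^\nu}{d\tau^2}\epsilon^\lambda + \Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{d\epsilon^\lambda}{d\tau} \tag{35}$$

Using (33), we can write the first term in (35) as

$$-\left(\epsilon^\rho\partial_\rho\Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} + 2\Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{d\epsilon^\lambda}{d\tau}\right)$$
The second term in this expression knocks off the third term in (34) and the fourth term in (35). Next,
using the geodesic equation in (IX.3.1), we can write the third term in (35) as

$$-\Gamma^\mu_{\nu\lambda}\Gamma^\nu_{\sigma\rho}\frac{dx^\sigma}{d\tau}\frac{dx^\rho}{d\tau}\epsilon^\lambda$$
Collecting terms, we find that (34) becomes

$$\frac{D^2\epsilon^\mu}{D\tau^2} = -\epsilon^\rho\partial_\rho\Gamma^\mu_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} + \left(\partial_\rho\Gamma^\mu_{\nu\lambda}\right)\frac{dx^\rho}{d\tau}\frac{dx^\nu}{d\tau}\epsilon^\lambda - \Gamma^\mu_{\nu\lambda}\Gamma^\nu_{\sigma\rho}\frac{dx^\sigma}{d\tau}\frac{dx^\rho}{d\tau}\epsilon^\lambda + \Gamma^\mu_{\sigma\rho}\Gamma^\rho_{\nu\lambda}\frac{dx^\sigma}{d\tau}\frac{dx^\nu}{d\tau}\epsilon^\lambda \tag{36}$$

Renaming indices and recalling the definition of the Riemann curvature tensor, we watch with satisfaction
the terms on the right hand side gathering themselves into a particularly nice form: indeed, what else
but (IX.3.6)?

This calculation, while slightly tedious, shows quite clearly where the $\partial\Gamma$ and the $\Gamma\Gamma$ terms in the
definition of the Riemann curvature tensor $R^{\cdot}{}_{\cdots} \sim \partial_\cdot\Gamma^\cdot_{\cdot\cdot} + \Gamma^\cdot_{\cdot\cdot}\Gamma^\cdot_{\cdot\cdot}$ come from.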
The stated energy conditions are invariant under $V^\mu \to aV^\mu$, and so we could normalize the arbitrary
timelike vector $V^\mu$ by $V^\mu V_\mu = -1$. Then, with

$$T_{\mu\nu} = (\rho + P)U_\mu U_\nu + Pg_{\mu\nu}, \qquad\text{we have}\qquad T_{\mu\nu}V^\mu V^\nu = (\rho + P)(U\cdot V)^2 - P$$
Go to the frame in which
$$U^\mu = (1, 0, 0, 0)/\sqrt{-g_{00}} \qquad\text{so that}\qquad (U\cdot V)^2 = -g_{00}(V^0)^2$$

which ranges from 1 to $\infty$ for an arbitrary timelike vector $V^\mu$. Thus, for the weak energy condition, we
obtain $\rho \ge 0$ and $(\rho + P) \ge 0$ as claimed.
Next,

$$T_{\mu\nu}V^\nu = (\rho + P)U_\mu(U\cdot V) + PV_\mu$$

Thus, for the dominant energy condition,

$$-g^{\mu\rho}(T_{\mu\nu}V^\nu)(T_{\rho\sigma}V^\sigma) = (\rho + P)^2(U\cdot V)^2 - 2P(\rho + P)(U\cdot V)^2 + P^2 \ge 0$$

For $(U\cdot V)^2$ equal to $\infty$ and 1, we obtain $\rho^2 \ge P^2$ and $\rho \ge 0$, respectively.
Since $T = -(\rho - 3P)$, the strong energy condition says that $(\rho + P)(U\cdot V)^2 - P \ge \frac12(\rho - 3P)$. For
$(U\cdot V)^2$ equal to $\infty$ and 1, we obtain $(\rho + P) \ge 0$ and $(\rho + 3P) \ge 0$, respectively.


Linearized Gravity, Gravitational Waves, and the Angular Momentum 
of Rotating Bodies 


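Start with the equation of geodesic deviation

$$\frac{D^2s^\mu}{D\tau^2} = -R^\mu{}_{\nu\rho\sigma}\frac{dx^\nu}{d\tau}\frac{dx^\sigma}{d\tau}s^\rho$$

(where I have changed the separation between the two particles from $\epsilon$ to $s$ to avoid possible confusion
with the polarization vector of the gravitational wave). To leading order, $\frac{dx^\mu}{d\tau} = (1, \vec 0)$ and so

$$\frac{d^2s^i}{dt^2} \simeq -R^i{}_{0j0}s^j$$

Now $\Gamma \sim O(h)$ and so we have

$$R^i{}_{0j0} = \partial_j\Gamma^i_{00} - \partial_0\Gamma^i_{0j} + O(h^2) \qquad\text{giving}\qquad R^i{}_{0j0} \simeq -\partial_0\Gamma^i_{0j}$$

since $\Gamma^i_{00}$ vanishes to this order in the TT gauge. Furthermore, since in TT gauge, $h_{0\mu}$ vanishes, we have
$\Gamma^i_{0j} \simeq \frac12\partial_0h_{ij}$.

Thus,

$$R^i{}_{0j0} \simeq -\frac12\frac{\partial^2h_{ij}}{\partial t^2}$$

and we obtain to leading order

$$\frac{d^2s^i}{dt^2} \simeq \frac12\left(\frac{\partial^2h_{ij}}{\partial t^2}\right)s^j$$

For a plane wave $h_{ij} = \epsilon_{ij}(k)\sin(\omega t - kx)$ moving along the $x$-axis,

$$\frac{d^2s^i}{dt^2} \simeq -\frac12\omega^2\epsilon_{ij}s^j\sin(\omega t - kx)$$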


IX.6 


3a 


3b-c 
For example, for the plus polarization,

$$\frac{d^2s^y}{dt^2} \simeq -\frac12\omega^2\epsilon_+s^y\sin(\omega t - kx) \qquad\text{and}\qquad \frac{d^2s^z}{dt^2} \simeq +\frac12\omega^2\epsilon_+s^z\sin(\omega t - kx)$$

with the relative minus sign between the two equations characteristic of a tidal force.


Under the stated conditions, (10) reduces to $\nabla^2\bar h_{\mu\nu} = 0$, which has the immediate solution $\bar h_{\mu\nu} = k_{\mu\nu}/r$,
with $k_{\mu\nu}$ some constant tensor, as is familiar to students of elementary physics. Now impose the harmonic
gauge condition $\partial^\mu\bar h_{\mu\nu} = 0$.

If you started to write $k_{r\nu}\partial^r(1/r) = 0$, stop! You are making an error. This is the subtlety I alluded to
in a footnote in connection with (4). Think about how (1) was derived in chapter VI.5: we used $\partial_\lambda\eta_{\mu\nu} = 0$
repeatedly. Thus, we must take for $\eta_{\mu\nu}$ the diagonal matrix with diagonal elements $(-1, 1, 1, 1)$ rather than
$(-1, 1, r^2, r^2\sin^2\theta)$. But since the problem has spherical symmetry and since $r$ has already appeared, it
is easy to fall in the trap of deducing that $k_{r\nu} = 0$.

Instead, we have $k_{\mu\nu}\partial^\mu(1/r) = 0 = -k_{i\nu}x^i/r^3$ and hence $k_{ij} = 0$ and $k_{i0} = 0$. (In other words, $\partial^\mu$ in
this context refers to Cartesian coordinates, not spherical coordinates.) Since $k_{\mu\nu}$ is symmetric, only $k_{00}$
is nonvanishing, and hence the only nonvanishing component of $\bar h_{\mu\nu}$ is $\bar h_{00} = 2r_S/r$, where, with the
malice of afterthought, we have renamed $k_{00} = 2r_S$. Note that $\bar h = -2r_S/r$. We next have to go from $\bar h_{\mu\nu}$
to $h_{\mu\nu} = \bar h_{\mu\nu} - \frac12\eta_{\mu\nu}\bar h$. We obtain

$$h_{00} = \bar h_{00} + \frac12\bar h = r_S/r, \qquad h_{ij} = -\frac12\eta_{ij}\bar h = \delta_{ij}r_S/r$$

Thus,

$$ds^2 = -\left(1 - \frac{r_S}{r}\right)dt^2 + \left(1 + \frac{r_S}{r}\right)(dx^2 + dy^2 + dz^2) = -\left(1 - \frac{r_S}{r}\right)dt^2 + \left(1 + \frac{r_S}{r}\right)(dr^2 + r^2d\Omega^2)$$

What? You exclaim that this is not the Schwarzschild metric: the coefficients of $dr^2$ and $r^2d\Omega^2$ are the
same. After all this work?

But this agrees with (19) with $h_{0i} = 0$, so our result is in fact correct. The resolution is that we
can perform a coordinate transformation. Call the coefficient of $d\Omega^2$ in the $ds^2$ given above $R^2$. So,
$R^2 = \left(1 + \frac{r_S}{r}\right)r^2$, that is, $R \simeq r + \frac12r_S$. Then $dR \simeq dr$ to this order, and we obtain the Schwarzschild metric

$$ds^2 \simeq -\left(1 - \frac{r_S}{R}\right)dt^2 + \left(1 + \frac{r_S}{R}\right)dR^2 + R^2d\Omega^2$$

to leading order, as expected.


Isometry, Killing Vector Fields, and Maximally Symmetric Spaces 


For a scalar, (19) collapses to $\xi^\lambda\partial_\lambda S = 0$. Since this holds for all Killing vectors, we have $\partial_\lambda S = 0$.


At some arbitrary point $X$, consider the $D(D-1)/2$ rotational Killing vectors, for which $\xi^\lambda(X) = 0$.
All subsequent statements are meant to hold at the arbitrary point $X$ (and thus hold everywhere). The
covariant derivative of these Killing vectors simplifies to $\xi^\mu{}_{;\nu} = \partial_\nu\xi^\mu$ (since $\xi^\lambda = 0$ at that point). Write this
as $\partial_\nu\xi^\mu = g^{\mu\lambda}\xi_{\lambda;\nu}$. Thus, we can write the first term in (19) as


Tyg 8 bap = TE cboxp = TG nose 


(The insertion of the Kronecker delta in the last step is merely for later convenience.) Similarly, we can 
write the second term as Dial, Costs and so on, until we get to the last term, which vanishes, since 
$\xi^\lambda(X) = 0$. Thus, at the point $X$, we have
¢ c a 
eae ao T° 85 he ‘) Sot =0 


Since &,.- span the basis of all D-by-D antisymmetric matrices, we conclude that the parenthetical 
expression in this equation must be symmetric under w < ¢. We now apply this result to various specific 
cases. 

For a maximally form invariant vector, we have Vos = Vie. Contracting ¢ and p, we have (D — 
1)V° = 0, thus proving the stated assertion. 

For a maximally form invariant 2-indexed tensor T,,,, the expression in parentheses above simplifies 
to 


(Tess + 7,35) = (755° + 7,552) 
Contracting ¢ and p and lowering , we obtain (D — 1)T,,, + Ts = 80 T?. Decompose Tig = Sug + Awe 
into its symmetric and antisymmetric parts. The preceding equation then becomes 

DS yo + (D — 2)Aags = 8a05p 
giving us 

(D-—2)Agg=9 and DSy5 = 58a 


with s = St. For D #2, A, = 0. 
It remains to show that s is a constant. Plug the result T,,, = So = 88 (absorbing a trivial factor) 
back into (19). The last term begets two terms: 


EP, Tyq = E04 (680) = (E80) 8 + Spa (E*28) 


The term $(\xi^\lambda\partial_\lambda g_{\mu\nu})s$ combines with the other two terms in (19) to yield zero, thanks to the Killing
condition (2). We are thus left with $\xi^\lambda\partial_\lambda s = 0$, which implies that $s$ is constant.

Finally, we have to deal with the special case of $D = 2$, for which $A_{\mu\nu}$ need not vanish. Indeed, besides
the metric tensor, we have the form invariant 2-indexed tensor $\varepsilon_{\mu\nu}\sqrt{g}$, namely the Levi-Civita tensor. See
chapter X.5 for further discussion.


Differential Forms and Vielbein 


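We have $e^1 = f(y)dx$ and $e^2 = g(x)dy$, so that

$$\omega^{12} = \frac{f'(y)}{g(x)}dx - \frac{g'(x)}{f(y)}dy \qquad\text{and so}\qquad R^{12} = -\left(\frac{g''(x)}{f(y)} + \frac{f''(y)}{g(x)}\right)dxdy$$

Converting to world indices, we find $R_{xyxy} = -(g(x)g''(x) + f(y)f''(y))$.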


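We have $dF = \frac12\partial_\lambda F_{\mu\nu}dx^\lambda dx^\mu dx^\nu = 0$. Since $dx^\lambda$, $dx^\mu$, and $dx^\nu$ anticommute, this is equivalent to
$\varepsilon^{\sigma\lambda\mu\nu}\partial_\lambda F_{\mu\nu} = 0$, which you should recognize as an identity (since $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$) and as the “other
half” of Maxwell’s equations.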


We have $e^1 = \Omega dx$ and $e^2 = \Omega dy$. Since we are dealing with a space rather than a spacetime, we have
Euclidean indices $a, b, \cdots$ rather than Minkowski indices, and we can freely move indices up and down.
Thus we write, for example,

$$de^1 = -\omega^{12}e^2 = \Omega_ydydx = (\Omega_y/\Omega^2)e^2e^1$$

where we use the notation $\Omega_y = \partial_y\Omega$ (and similarly $\Omega_x$). Thus we have $\omega^{12} = (\Omega_y/\Omega^2)e^1 + (e^2$ term). By
symmetry, $\omega^{21} = -\omega^{12} = (\Omega_x/\Omega^2)e^2 + (e^1$ term), and hence we obtain

$$\omega^{12} = (\Omega_ye^1 - \Omega_xe^2)/\Omega^2 = (\Omega_ydx - \Omega_xdy)/\Omega$$
Thus we have

$$R^{12} = -\left((\Omega_x/\Omega)_x + (\Omega_y/\Omega)_y\right)dxdy = -(\nabla^2\log\Omega)e^1e^2/\Omega^2$$

The curvature is therefore given by $\frac12R = R^{12}{}_{12} = -(\nabla^2\log\Omega)/\Omega^2$.
As a check, we learned from exercise I.5.13 that the metric for the sphere can be written as $ds^2 =
(d\rho^2 + \rho^2d\theta^2)/(1 + \frac14\rho^2)^2$. Plugging in $\Omega = (1 + \frac14\rho^2)^{-1}$, we obtain $R = 2$, as expected.


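To be concise, we will abuse notation as noted below. Also, as noted in exercise 3, since we are dealing
with a space, we have Euclidean indices $a, b, \cdots$, which we will freely move up and down. World indices,
of course, have to be moved using the metric. From $e^a = \Omega dx^a$ (this is already notational abuse: strictly
speaking, we should write $e^a = \Omega\delta^a_idx^i$ to distinguish between the world index $i$ and the Euclidean index
$a$), we have $de^a = \partial_i\Omega dx^idx^a = -\omega^{ab}e^b = -\omega^{ab}\Omega dx^b$. We will write $\Omega_i = \partial_i\Omega$. Using the fact that $\omega^{ab}$ is
antisymmetric, we obtain

$$\omega^{ab} = \Omega^{-1}\left(\Omega_bdx^a - \Omega_adx^b\right)$$
and hence we have

$$d\omega^{ab} = \Omega^{-1}\left(\Omega_{bc}dx^cdx^a - \Omega_{ac}dx^cdx^b\right) - \Omega^{-2}\left(\Omega_b\Omega_cdx^cdx^a - \Omega_a\Omega_cdx^cdx^b\right)$$
Also, we have

$$\omega^{ac}\omega^{cb} = \Omega^{-2}\left(\Omega_cdx^a - \Omega_adx^c\right)\left(\Omega_bdx^c - \Omega_cdx^b\right)$$
$$= \Omega^{-2}\left(\Omega_b\Omega_cdx^adx^c - \Omega_c\Omega_cdx^adx^b - \Omega_a\Omega_cdx^bdx^c\right)$$
Putting this together, we obtain the curvature 2-form
$$R^{ab} = d\omega^{ab} + \omega^{ac}\omega^{cb}$$
$$= \Omega^{-1}\left(\Omega_{bc}dx^cdx^a - \Omega_{ac}dx^cdx^b\right) - 2\Omega^{-2}\left(\Omega_b\Omega_cdx^cdx^a - \Omega_a\Omega_cdx^cdx^b\right) - \Omega^{-2}\Omega_c\Omega_cdx^adx^b$$

It is instructive to compare our work here with that in exercise 3. Note in particular that the nonabelian
term $\omega^{ac}\omega^{cb}$ is absent there.

Next, we have to extract the Riemann curvature tensor $R^{ab}{}_{cd}$ defined by $R^{ab} = \frac12R^{ab}{}_{cd}e^ce^d$. So, replace
$dx^c$ by $\Omega^{-1}e^c$ in the expression for $R^{ab}$ above and read off

$$R_{abcd} = \left(\Omega^{-3}\left(\Omega_{bc}\delta_{ad} - \Omega_{ac}\delta_{bd}\right) - 2\Omega^{-4}\left(\Omega_b\Omega_c\delta_{ad} - \Omega_a\Omega_c\delta_{bd}\right) - \Omega^{-4}\Omega_e\Omega_e\delta_{ac}\delta_{bd}\right) - (c \leftrightarrow d)$$

From this we obtain $R_{ac} = R_{abcd}\delta^{bd}$ and $R = R_{ac}\delta^{ac}$. To obtain the “usual” Riemann and Ricci tensors,
remember to convert Euclidean indices to world indices with the vielbein, thus for example, $R_{\mu\nu} =
e^a_\mu e^b_\nu R_{ab} = \Omega^2R_{ab}\delta^a_\mu\delta^b_\nu$. Keep in mind that with our abuse of notation, we have $\Omega_{ab} = \delta^\mu_a\delta^\nu_b\partial_\mu\partial_\nu\Omega$; that is,
the indices on $\Omega_{ab}$ in the expression above are world indices to begin with.
Following this procedure, we obtain the Riemann tensor (which we won’t display, since it can be read
off from what is given above), the Ricci tensor

$$R_{\mu\nu} = 2(d-2)\Omega^{-2}\partial_\mu\Omega\partial_\nu\Omega - \left(\delta_{\mu\nu}\left(\Omega^{-1}\partial^2\Omega + (d-3)\Omega^{-2}\partial_\lambda\Omega\partial^\lambda\Omega\right) + (d-2)\Omega^{-1}\partial_\mu\partial_\nu\Omega\right)$$
and the scalar curvature

$$R = -2(d-1)\Omega^{-3}\partial^2\Omega - (d-1)(d-4)\Omega^{-4}\partial_\lambda\Omega\partial^\lambda\Omega$$

(Note that the world indices are raised and lowered with the Euclidean metric $\delta_{\mu\nu}$.)

This exercise suggests a relatively easy way to obtain the results of exercise VI.1.13. We simply promote,
in the expressions given here, $\delta_{\mu\nu}$ to $g_{\mu\nu}$ and the partial derivatives to covariant derivatives. Finally, the
leading term in the expressions given in exercise VI.1.13 can be determined by setting $\Omega$ to a constant.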
IX.8


For example, there we had $\tilde R = \Omega^{-2}R + \cdots$. The factor $\Omega^{-2}$ is determined by noting that for $\Omega$ constant,
we are simply scaling the coordinates by $\Omega$.

Finally, we can check the results given above. For the $d$-dimensional sphere, $\Omega = (1 + \frac14x^2)^{-1}$ and we
obtain $R = d(d-1)$, as we have long known. For anti de Sitter spacetime, write $x^1$ for convenience as $x$,
then $\Omega = 1/x$. The only derivatives we have to calculate are then $\partial_x\Omega = -1/x^2$ and $\partial^2\Omega = \partial_x\partial_x\Omega = 2/x^3$,
and the messy expressions above collapse nicely to $R_{\mu\nu} = -(d-1)g_{\mu\nu}$ and $R = -d(d-1)$, in agreement
with what we will learn in chapter IX.11.
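These two checks can also be carried out numerically. The sketch below evaluates the scalar curvature $R = -2(d-1)\Omega^{-3}\partial^2\Omega - (d-1)(d-4)\Omega^{-4}\partial_\lambda\Omega\partial^\lambda\Omega$ of the metric $\Omega^2\delta_{\mu\nu}$ by central finite differences. The function names are my own, and the "anti de Sitter" example is taken with Euclidean signature, as in the derivation above.

```python
def scalar_curvature(omega, x, h=1e-3):
    """Scalar curvature of the metric omega(x)^2 * delta at the point x (list of d floats)."""
    d = len(x)
    w = omega(x)
    grad_sq = 0.0   # flat-space (d Omega)^2
    lap = 0.0       # flat-space Laplacian of Omega
    for i in range(d):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        wp, wm = omega(xp), omega(xm)
        grad_sq += ((wp - wm) / (2.0 * h)) ** 2
        lap += (wp - 2.0 * w + wm) / h ** 2
    return -2.0 * (d - 1) * lap / w ** 3 - (d - 1) * (d - 4) * grad_sq / w ** 4

def sphere(x):
    """Conformal factor (1 + x^2/4)^(-1) of the unit sphere."""
    return 1.0 / (1.0 + sum(c * c for c in x) / 4.0)

def hyperbolic(x):
    """Conformal factor 1/x^1, the Euclidean-signature analog of anti de Sitter."""
    return 1.0 / x[0]

print(scalar_curvature(sphere, [0.3, -0.2]))            # d = 2, expect  2
print(scalar_curvature(sphere, [0.3, -0.2, 0.5]))       # d = 3, expect  6
print(scalar_curvature(hyperbolic, [1.5, 0.2, -0.7]))   # d = 3, expect -6
```

The result should not depend on the point chosen, since both spaces have constant curvature.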


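With $e^1 = dr$ and $e^2 = f(r, \theta)d\theta$, we obtain $de^1 = 0 = -\omega^{12}e^2$, which implies $\omega^{12} \propto e^2$, and
$$de^2 = (\partial_rf)drd\theta = (\partial_rf/f)e^1e^2 = -\omega^{21}e^1$$

We obtain
$$\omega^{12} = -(\partial_rf/f)e^2 = -(\partial_rf)d\theta$$

Thus we have
$$R^{12} = d\omega^{12} = -\left(\partial_r^2f\right)drd\theta = -\left(\partial_r^2f/f\right)e^1e^2$$

from which we find the scalar curvature $R = 2R^{12}{}_{12} = -2\partial_r^2f/f$.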

Let us check this against the two examples given in chapter II.2. For polar coordinates on the plane,
we find $f(r, \theta) = r$ and indeed $R = 0$. For spherical coordinates on the unit sphere, with suitable renaming
of the coordinates, we have $f(r, \theta) = \sin r$, and indeed $R = 2$.


Differential Forms Applied 


We have $e^1 = dr$ and $e^2 = f(r)d\theta$, so that $de^1 = 0 = -\omega^1{}_2e^2$, which tells us that $\omega^1{}_2$ is proportional to $e^2$, and
$de^2 = f'(r)drd\theta = -\omega^2{}_1e^1$, which tells us that $\omega^2{}_1 = f'(r)d\theta$. Use the antisymmetry to see that $\omega^2{}_1 = -\omega^1{}_2$
and so $\omega^2{}_1$ cannot contain a piece proportional to $e^1$. We then obtain
$$R^2{}_1 = d\omega^2{}_1 = f''(r)drd\theta = \frac{f''(r)}{f(r)}e^1e^2$$
so that
$$R^{12}{}_{12} = -\frac{f''(r)}{f(r)} \qquad\text{and}\qquad R = 2R^{12}{}_{12} = -\frac{2f''(r)}{f(r)}$$
Setting $R = 2C$ gives us the differential equation $f'' = -Cf$, whose solutions are given by either trigonometric
or hyperbolic sine and cosine, depending on whether $C$ is positive or negative.

But now the global condition $\theta \cong \theta + 2\pi$ tells us that for the space to be locally flat as discussed in
chapter I.6, we must have $f(r) \to r$ as $r \to 0$. This not only fixes $f(r)$ but also requires $C$ to be either 1
or $-1$. (Around the tip of a cone, we could have $f(r) \to kr$ as $r \to 0$ for $k < 1$, but then the coordinates
and the curvature would be singular at the tip and our formalism breaks down.) For positive constant
curvature, $f(r) = \sin r$, and so the space here, as you have already seen in chapter I.5, is just the sphere
in disguise (with the usual coordinates $\theta \to r$ and $\varphi \to \theta$). For negative constant curvature, $f(r) = \sinh r$.
Satisfyingly, this agrees with the result you got for exercise I.5.5.

It is also instructive, noting that a circle of radius $a$ centered at the origin has circumference $2\pi f(a)$,
to use the mites’ formula introduced in the prologue:

$$R = \lim_{\text{radius}\to0}\frac{6}{(\text{radius})^2}\left(1 - \frac{\text{circumference}}{2\pi\,\text{radius}}\right) = \lim_{r\to0}\frac{6}{r^2}\left(1 - \frac{\sin r}{r}\right) = \frac{6}{r^2}\left(\frac{r^3}{3!\,r}\right) = 1$$

(We now see that the mite professor of geometry included the overall factor of 6 so that the unit sphere
would have unit curvature.)
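The limit in the mites’ formula is easy to watch numerically. In this small sketch the function name `mite_curvature` is my own; it takes the function $f$ appearing in the circumference $2\pi f(r)$.

```python
import math

def mite_curvature(f, r):
    """6/r^2 * (1 - circumference/(2 pi r)) for a circle of circumference 2 pi f(r)."""
    return 6.0 / r ** 2 * (1.0 - f(r) / r)

for r in (0.5, 0.1, 0.01):
    print(r,
          mite_curvature(math.sin, r),        # sphere: tends to +1
          mite_curvature(math.sinh, r),       # hyperbolic plane: tends to -1
          mite_curvature(lambda s: s, r))     # flat plane: exactly 0
```

The first correction to the limit is of order $r^2/20$, so the convergence is quick.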
We have $e^1 = y^pdx$ and $e^2 = x^pdy$. Then

$$de^1 = py^{p-1}dydx = -pe^1e^2/(yx^p) = -\omega^1{}_2e^2$$
and thus

$$\omega^1{}_2 = pe^1/(yx^p) + (e^2\text{ term})$$

By symmetry,

$$\omega^2{}_1 = pe^2/(xy^p) + (e^1\text{ term})$$
and so
$$\omega^1{}_2 = -\omega^2{}_1 = p\left(e^1/(yx^p) - e^2/(xy^p)\right) = p\left((y^{p-1}/x^p)dx - (x^{p-1}/y^p)dy\right)$$

the last form being easier to differentiate. Differentiating, we have

$$R^1{}_2 = d\omega^1{}_2 = -p(p-1)\left((x^{p-2}/y^p) + (y^{p-2}/x^p)\right)dxdy$$
$$= -p(p-1)\left((1/(x^2y^{2p})) + (1/(y^2x^{2p}))\right)e^1e^2$$

Since $R^a{}_b = \frac12R^a{}_{bcd}e^ce^d$, we find $R^1{}_{212} = -p(p-1)((1/(x^2y^{2p})) + (1/(y^2x^{2p}))) = R_{22}$. Thus, we finally
obtain

$$R = -\frac{2p(p-1)}{x^2y^2}\left(\frac{1}{x^{2(p-1)}} + \frac{1}{y^{2(p-1)}}\right)$$

The space is flat for $p = 0$ (Pythagoras) and for $p = 1$ (see exercise VI.1.17). For $p = 1/2$, $R = \frac12\left(\frac{1}{x^2y} + \frac{1}{xy^2}\right)$.


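We have $e^1 = ad\theta$, $e^2 = (L + a\sin\theta)d\varphi$, and so $de^1 = 0$, telling us that $\omega^{12} \propto d\varphi$, and $de^2 = a\cos\theta\,d\theta d\varphi =
-\omega^{21}e^1 = -\omega^{21}ad\theta$, so that $\omega^{21} = \cos\theta\,d\varphi$. Thus we have $R^{21} = d\omega^{21} = -\sin\theta\,d\theta d\varphi = -\frac{\sin\theta}{a(L + a\sin\theta)}e^1e^2$,
giving $R^{12}{}_{12} = \frac{\sin\theta}{a(L + a\sin\theta)} = R_{11} = R_{22}$. Hence we obtain $R = \delta^{ab}R_{ab} = \frac{2\sin\theta}{a(L + a\sin\theta)}$. Note that, as might be
expected, at $\theta = 0$ or $\pi$, $R = 0$; at $\theta = \pi/2$, $R = \frac{2}{a(L + a)}$; and at $\theta = 3\pi/2$, $R = -\frac{2}{a(L - a)}$. The “outer half”
of the torus has positive curvature, while the “inner half” has negative curvature.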
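As a consistency check on the torus curvature, one can verify that the positive and negative halves cancel when $R$ is integrated over the whole surface with the area element $a(L + a\sin\theta)\,d\theta\,d\varphi$, as the Gauss-Bonnet theorem requires for a torus. The following is a sketch with function names of my own choosing.

```python
import math

def torus_curvature(theta, a, L):
    """Scalar curvature R = 2 sin(theta) / (a (L + a sin(theta)))."""
    return 2.0 * math.sin(theta) / (a * (L + a * math.sin(theta)))

def total_curvature(a, L, steps=2000):
    """Integral of R over the torus with area element a (L + a sin theta) dtheta dphi."""
    dtheta = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta   # midpoint rule in theta
        total += torus_curvature(theta, a, L) * a * (L + a * math.sin(theta)) * dtheta
    return 2.0 * math.pi * total     # the phi integral is trivial

a, L = 1.0, 3.0
print(torus_curvature(math.pi / 2, a, L))       # outer equator:  2 / (a (L + a))
print(torus_curvature(3 * math.pi / 2, a, L))   # inner equator: -2 / (a (L - a))
print(total_curvature(a, L))                    # should vanish
```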


ps =dt, e'=A(t)dx, e =Bi(t)dy, b= eee we have de® =0 = —o? e4, so that wo = (e* term, 
no e° term). Next we have de! = Adtdx = (A/A)e%e! = —ae° -—@ Ve, which implies wy =(A/A)e! = 
w%. Here a possible e° term in w', is disallowed by our earlier tenclasion. Note that while it is possible 
here for «1, to be proportional to e”, this would imply that w, is proportional to e!, but this is ruled out by 
the antisymmetry of w!*. (Note that we have used repeatedly the fact that the metric is diagonal, so that 


we can raise and lower indices easily.) We thus conclude that the only nonvanishing components of os 


are wo, = wy, = (A/A)e! = Adx and the components obtained from it by permuting 1, 2, 3 and A, B,C. 


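We next obtain

$R^0{}_1 = d\omega^0{}_1 + \omega^0{}_a\omega^a{}_1 = \ddot A\,dt\,dx + 0 = (\ddot A/A)e^0 e^1 = \frac{1}{2}R^0{}_{1ab}\,e^a e^b$

and hence we have $R^0{}_{101} = \ddot A/A = -R^1{}_{010}$. Since $\omega^1{}_2 = 0$, we might be tempted to conclude that $R^1{}_2 = 0$ also, but we would be wrong. In fact,

$R^1{}_2 = d\omega^1{}_2 + \omega^1{}_0\omega^0{}_2 + \omega^1{}_3\omega^3{}_2 = 0 + (\dot A/A)(\dot B/B)e^1 e^2 + 0$
and hence we have

$R^1{}_{212} = (\dot A/A)(\dot B/B) = R^2{}_{121}$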


All other nonvanishing components of $R^a{}_b$ are obtained from $R^0{}_1$ and $R^1{}_2$ by permuting 1, 2, 3 and A, B, C.
Finally, $R_{ab}$ (note that this is not a form and is not to be confused with the curvature 2-forms we had before) is given by

$R_{00} = -R^0{}_{a0a} = -\left((\ddot A/A) + (\ddot B/B) + (\ddot C/C)\right)$

and

$R_{11} = R^0{}_{101} + R^2{}_{121} + R^3{}_{131} = (\ddot A/A) + (\dot A/A)\left((\dot B/B) + (\dot C/C)\right)$
(Again, the other nonvanishing components of $R_{ab}$ are obtained from $R_{00}$ and $R_{11}$ by permuting 1, 2, 3 and A, B, C.)

So the Einstein field equation $R_{ab} = 0$ consists of 4 equations for 3 unknown functions A, B, C, but we know that 1 linear combination of the 4 equations corresponds to the Bianchi identity. By inspection, we see that power laws $A = t^p$, $B = t^q$, $C = t^r$ solve these equations: $R_{00} = 0$ gives $p + q + r = p^2 + q^2 + r^2$, and $R_{11} = 0$ gives $p^2 + p(-1 + q + r) = 0$, implying that either $p = 0$ or $p + q + r = 1$.
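The power-law conditions can be checked with exact rational arithmetic. The sketch below is an added illustration under stated assumptions: it uses $\ddot A/A = p(p-1)/t^2$ and $\dot A/A = p/t$ for $A = t^p$, multiplies the Ricci components by $t^2$, and takes $R_{00}$ with an overall minus sign (the sign does not affect the vanishing condition). The one-parameter family in `kasner_exponents` is a standard parameterization of solutions with $p + q + r = p^2 + q^2 + r^2 = 1$; it is not taken from the text, and the function names are hypothetical.

```python
from fractions import Fraction


def ricci_times_t_squared(p, q, r):
    """Return (t^2 R_00, t^2 R_11, t^2 R_22, t^2 R_33) for A = t^p, B = t^q, C = t^r."""
    r00 = -(p * (p - 1) + q * (q - 1) + r * (r - 1))
    r11 = p * (p - 1) + p * (q + r)
    r22 = q * (q - 1) + q * (r + p)
    r33 = r * (r - 1) + r * (p + q)
    return r00, r11, r22, r33


def kasner_exponents(u):
    """A standard one-parameter family of exponents (an assumption of this sketch)."""
    u = Fraction(u)
    d = 1 + u + u * u
    return -u / d, (1 + u) / d, u * (1 + u) / d


if __name__ == "__main__":
    for u in (0, 1, 2, Fraction(1, 3)):
        p, q, r = kasner_exponents(u)
        print((p, q, r), ricci_times_t_squared(p, q, r))
    # A set of exponents that does NOT solve the equations, for contrast:
    print(ricci_times_t_squared(Fraction(1), Fraction(1), Fraction(1)))
```

For u = 1 the exponents are (-1/3, 2/3, 2/3): at least one direction must contract (or stay fixed) while the others expand, since the two conditions cannot be met with three strictly positive exponents.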

We can easily extend this to higher dimensions. For $ds^2 = -dt^2 + f_1(t)^2 (dx^1)^2 + \cdots$, we obtain

$R_{00} = -\sum_a \ddot f_a/f_a$ and $R_{aa} = (\ddot f_a/f_a) + (\dot f_a/f_a)\sum_{b \neq a}(\dot f_b/f_b)$

Einstein's field equation is solved by $f_a = t^{p_a}$, with $p_a$ satisfying $\sum_a p_a^2 = \sum_a p_a = 1$.
Let us also ask what happens when d = 2. Then $R^a{}_b$ contains only 1 component, namely $R^0{}_1$. Hence $\ddot A = 0$ and after shifting the origin of t and so forth, we have $ds^2 = -dt^2 + t^2 dx^2$. Analytically continuing and giving the coordinates more familiar names, we see that we have found the plane in polar coordinates.


IX.9 Conformal Algebra


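Simply calculate $\eta_{\mu\nu}\left(\frac{x_1^\mu}{x_1^2} - \frac{x_2^\mu}{x_2^2}\right)\left(\frac{x_1^\nu}{x_1^2} - \frac{x_2^\nu}{x_2^2}\right)$, which equals $(x_1 - x_2)^2/(x_1^2 x_2^2)$. If the two points are null separated, $(x_1 - x_2)^2 = 0$ and hence remains 0 under inversion.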
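As an added illustration, the behavior of the interval under inversion $x^\mu \to x^\mu/x^2$ can be verified with exact fractions. This sketch assumes the signature (-, +, +, +) and uses the identity $(x_1' - x_2')^2 = (x_1 - x_2)^2/(x_1^2 x_2^2)$, which follows from expanding the product; the helper names are hypothetical.

```python
from fractions import Fraction

ETA = (-1, 1, 1, 1)  # assumed Minkowski signature (-, +, +, +)


def dot(u, v):
    """Minkowski inner product eta_{mu nu} u^mu v^nu."""
    return sum(g * a * b for g, a, b in zip(ETA, u, v))


def invert(x):
    """Inversion x^mu -> x^mu / x^2 (requires x^2 != 0)."""
    s = dot(x, x)
    return tuple(Fraction(c) / s for c in x)


def interval(x1, x2):
    """(x1 - x2)^2 with the Minkowski metric."""
    d = tuple(a - b for a, b in zip(x1, x2))
    return dot(d, d)


if __name__ == "__main__":
    x1 = (0, 1, 0, 0)
    x2 = (1, 2, 0, 0)   # differs from x1 by the null vector (1, 1, 0, 0)
    x3 = (0, 0, 2, 0)   # spacelike separated from x1
    print(interval(x1, x2), interval(invert(x1), invert(x2)))
    print(interval(x1, x3), interval(invert(x1), invert(x3)))
```

The null pair stays null, while the spacelike interval is rescaled by $1/(x_1^2 x_3^2)$, as the identity requires.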


For the Euclidean plane, (1) simplifies to $\partial_\mu\xi_\nu + \partial_\nu\xi_\mu = 2c\,\delta_{\mu\nu}$, with c some constant. Then $\partial_x\xi_x = c$, with the solution $\xi_x = cx + f_1(y)$. Similarly, $\partial_y\xi_y = c$ gives $\xi_y = cy + f_2(x)$. Plugging into $\partial_x\xi_y + \partial_y\xi_x = 0$ gives $f_1'(y) + f_2'(x) = 0$, and hence $f_1 = a_1 y + b_1$ and $f_2 = a_2 x + b_2$, with $a_1 + a_2 = 0$. Therefore, $\xi_x = cx + ay + b_1$ and $\xi_y = cy - ax + b_2$. The three conformal Killing vectors are $\xi = (b_1, b_2)$ (translation), $\xi = (y, -x)$ (rotation), and $\xi = (x, y)$ (dilation).
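The solution $\xi_x = cx + ay + b_1$, $\xi_y = cy - ax + b_2$ can be checked against the condition $\partial_\mu\xi_\nu + \partial_\nu\xi_\mu = 2c\,\delta_{\mu\nu}$ directly. The sketch below is an added illustration; it uses central differences with rational arithmetic, which are exact for fields linear in x and y. The function names and the sample parameter values are hypothetical.

```python
from fractions import Fraction


def xi(x, y, c, a, b1, b2):
    """The vector field (xi_x, xi_y) found in the text."""
    return (c * x + a * y + b1, c * y - a * x + b2)


def symmetrized_gradient(field, x, y, h=Fraction(1, 10)):
    """Return S[i][j] = d_i xi_j + d_j xi_i using central differences.

    For a field linear in x and y the central difference is exact.
    """
    fx_plus, fx_minus = field(x + h, y), field(x - h, y)
    fy_plus, fy_minus = field(x, y + h), field(x, y - h)
    # grad[i][j] = d_i xi_j, with i = 0 for x and i = 1 for y
    grad = [
        [(fx_plus[j] - fx_minus[j]) / (2 * h) for j in range(2)],
        [(fy_plus[j] - fy_minus[j]) / (2 * h) for j in range(2)],
    ]
    return [[grad[i][j] + grad[j][i] for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    c, a, b1, b2 = 3, 5, 7, 11

    def field(x, y):
        return xi(x, y, c, a, b1, b2)

    print(symmetrized_gradient(field, Fraction(2), Fraction(-1)))
```

The off-diagonal entries vanish because the rotation part contributes $+a$ and $-a$, and the diagonal entries equal $2c$, so only the dilation changes the metric, and only by an overall factor.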


IX.10 De Sitter Spacetime


The event horizon of the observer at the south pole is given by the diagonal $t = \chi - \frac{\pi}{2}$. Plugging this into (44), we obtain $T = \tan t$, $r = 1$, and $W = -\tan t$, and thus the intersection of $T + W = 0$ and $r = 1$. The event horizon of the observer at the north pole is given by the other diagonal $t = \frac{\pi}{2} - \chi$.


X.3 Effective Field Theory Approach to Einstein Gravity


Lorentz and gauge invariance allow us to construct, in addition to the mass dimension 4 Maxwell scalar $F_{\mu\nu}F^{\mu\nu}$, also the mass dimension 6 scalar $\partial_\lambda F_{\mu\nu}\partial^\lambda F^{\mu\nu}$. Moving further up in mass scale, we also have the mass dimension 8 scalars $(F_{\mu\nu}F^{\mu\nu})^2$ and $(F_{\mu\nu}\tilde F^{\mu\nu})^2$ (where $\tilde F_{\mu\nu} = -\frac{1}{2}\varepsilon_{\mu\nu\rho\sigma}F^{\rho\sigma}$, as you might recall from chapter IV.2). If we add them to the action to form

$S = \int d^4x \left(-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \alpha l^2\,\partial_\lambda F_{\mu\nu}\partial^\lambda F^{\mu\nu} + \cdots + l'^4\left(\beta\,(F_{\mu\nu}F^{\mu\nu})^2 + \gamma\,(F_{\mu\nu}\tilde F^{\mu\nu})^2\right) + \cdots + A_\mu J^\mu\right)$

we have to introduce, by high school dimensional analysis, two lengths $l$ and $l'$, which a priori may not be the same. Here $\alpha$, $\beta$, and $\gamma$ are dimensionless numbers. Since we understand quantum electrodynamics (but not quantum gravity), we can in fact determine all these unknown quantities (see QFT Nut, chapters III.7 and VIII.3 and p. 460). The lengths $l$ and $l'$ are set by the electron mass.
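The dimensional analysis mentioned above can be spelled out in a few lines. This is an added sketch, assuming natural units in 4 spacetime dimensions, where $F_{\mu\nu}$ has mass dimension 2, each derivative has mass dimension 1, and $d^4x$ has mass dimension -4; the function name is hypothetical.

```python
def coefficient_length_power(num_field_strengths, num_derivatives):
    """Power of a length needed in front of an operator built from
    field strengths F and derivatives so that its contribution to the
    action is dimensionless (4 spacetime dimensions, natural units).

    Mass dimensions assumed: [F] = 2, [derivative] = 1, [d^4x] = -4.
    A coefficient of mass dimension -k is a length to the power k.
    """
    operator_mass_dimension = 2 * num_field_strengths + num_derivatives
    return operator_mass_dimension - 4


if __name__ == "__main__":
    print(coefficient_length_power(2, 0))  # F F: Maxwell term
    print(coefficient_length_power(2, 2))  # dF dF: dimension 6 operator
    print(coefficient_length_power(4, 0))  # (F F)^2: dimension 8 operators
```

The Maxwell term needs no length, the dimension 6 term needs two powers, and the dimension 8 terms need four, which is why $l^2$ and $l'^4$ appear in the action.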
X.5 Topological Field Theory


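3 Let us evaluate

$\varepsilon_{\alpha\beta\gamma\delta}R^{\alpha\beta}R^{\gamma\delta} = \frac{1}{4}\varepsilon_{\alpha\beta\gamma\delta}R^{\alpha\beta}{}_{\mu\nu}R^{\gamma\delta}{}_{\lambda\omega}\,dx^\mu dx^\nu dx^\lambda dx^\omega$

$= \frac{1}{4}\varepsilon_{\alpha\beta\gamma\delta}\,e^\alpha{}_\rho e^\beta{}_\sigma e^\gamma{}_\tau e^\delta{}_\eta\,R^{\rho\sigma}{}_{\mu\nu}R^{\tau\eta}{}_{\lambda\omega}\,\varepsilon^{\mu\nu\lambda\omega}\,d^4x$

$= \frac{1}{4}(\det e)\,\varepsilon_{\rho\sigma\tau\eta}\varepsilon^{\mu\nu\lambda\omega}R^{\rho\sigma}{}_{\mu\nu}R^{\tau\eta}{}_{\lambda\omega}\,d^4x$

$= \frac{1}{4}d^4x\sqrt{g}\left(\delta^\mu_\rho\delta^\nu_\sigma\delta^\lambda_\tau\delta^\omega_\eta \pm \text{permutations}\right)R^{\rho\sigma}{}_{\mu\nu}R^{\tau\eta}{}_{\lambda\omega}$

$= d^4x\sqrt{g}\left(R^2 - 4R^{\mu\nu}R_{\mu\nu} + R^{\mu\nu\lambda\sigma}R_{\mu\nu\lambda\sigma}\right)$

In the next-to-last step, we used the fact that $\varepsilon_{\rho\sigma\tau\eta}\varepsilon^{\mu\nu\lambda\omega}$ is equal to $\pm 1$ or 0 according to whether the two sets of indices $(\rho\sigma\tau\eta)$ and $(\mu\nu\lambda\omega)$ match or not, up to some permutation. In the last step, we add up the various possibilities. The combination $(R^2 - 4R^{\mu\nu}R_{\mu\nu} + R^{\mu\nu\lambda\sigma}R_{\mu\nu\lambda\sigma})$ is known as the Gauss-Bonnet term. Incidentally, this computation shows eloquently the advantage of using differential forms.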
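One way to see that the Gauss-Bonnet combination is topological is to evaluate it on a space where everything is known in closed form. The sketch below is an added illustration, not part of the original solution: it assumes the unit 4-sphere with $R_{\mu\nu\lambda\sigma} = g_{\mu\lambda}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\lambda}$, its volume $8\pi^2/3$, and the standard normalization $\chi = \frac{1}{32\pi^2}\int d^4x\sqrt{g}\,(R^2 - 4R^{\mu\nu}R_{\mu\nu} + R^{\mu\nu\lambda\sigma}R_{\mu\nu\lambda\sigma})$. None of these inputs appears in the text, and the function name is hypothetical.

```python
import math


def gauss_bonnet_density_unit_sphere(d):
    """R^2 - 4 R_{mn} R^{mn} + R_{mnls} R^{mnls} on the unit d-sphere.

    With R_{mnls} = g_{ml} g_{ns} - g_{ms} g_{nl} one has
    R_{mn} = (d - 1) g_{mn}, R = d (d - 1), and R_{mnls} R^{mnls} = 2 d (d - 1).
    """
    ricci_scalar = d * (d - 1)
    ricci_squared = d * (d - 1) ** 2
    riemann_squared = 2 * d * (d - 1)
    return ricci_scalar**2 - 4 * ricci_squared + riemann_squared


if __name__ == "__main__":
    density = gauss_bonnet_density_unit_sphere(4)
    volume_s4 = 8 * math.pi**2 / 3          # volume of the unit 4-sphere
    euler_characteristic = density * volume_s4 / (32 * math.pi**2)
    print(density, euler_characteristic)
```

The integral reproduces the Euler characteristic of the 4-sphere, consistent with the Gauss-Bonnet term being insensitive to smooth deformations of the metric.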


X.6 A Brief Introduction to Twistors 


2 This is worked out on p. 493 in QFT Nut, 2nd edition. 


3 This is worked out on p. 509 in QFT Nut, 2nd edition.


Page numbers followed by letters e, f, and n refer to exercises, figures, and notes, respectively.


}-factor, Einstein’s field equation, and metric tensor 
formalism, 76 

“1-2” test, 326 

1-forms, 599-600; Hodge star operation on, 724 

(2+1)-dimensional spacetime, Chern-Simons term 
in, 721 

2-D metric, 77 

2-dimensional solid state structures, gauge potential 
of, 721 

2-forms, 601 

2-indexed tensors, definition of, 53 

2-manifold, without boundary, 727 

3-D metric, and black holes, 77 

3-dimensional spaces, embedding into Minkowskian 
spacetime, 634 

3-spaces, maximally symmetric, 610 

3-spheres: cosmological principle, 491; metric tensor 
of, 296 

3-vectors, transformation into 4-vectors, 218

4-current, 251 

4-dimensional electromagnetism, 720-721 

4-dimensional spacetime, 386; divergence theorem 
generalized to, 386 

4-gluon scattering, 738, 744e 

4-momenta: in electromagnetism, from special 
relativity, 245; lightlike, 782; of particles in box, 
227 

4-vectors: from 3-vectors, 218; length of, 182; 
relativistic curl of, 252; spacetime metrics, 181 


4-velocity: around black holes, 414; of finite sized 
object, 716 

5-dimensional Einstein field equations, for 2-brane 
model, 700 

5-dimensional scalar curvature, 684-685

5-dimensional spacetime. See Kaluza-Klein theory


Abbott, E. A., 671 

abelian gauge theory, 681n 

Abraham, Max, on Newton gravity and Lorentz 
invariance, 580 

acausality, of universe, 754, 783 

“accelerated” thought experiment, 280-283, 286 

acceleration: and curvature, 554; Galilean 
transformation, 276-277; Galileo’s law of, 140; 
and general relativity, 189; and gravity, 269, 271; in 
Minkowski spacetime, 190; relativistic particles, 
277 

accretion disks, 414-415; around Kerr black holes, 
474 

acoustic peak, microwave background, 523-525, 
788n 

action: for 2-brane model, 700; constraints in 
varying, 755-756; containing two powers of time 
derivative, search for, 338-339; different sectors 
of matter action, 382-383; dimensions of, 346; 
at a distance, of Newton’s gravity, 145; Einstein- 
Hilbert (see Einstein-Hilbert action); for elastic 
medium, 771; electromagnetic, 244, 250-251, 333;
for everything else, 347; for fields in spacetime 
described by a metric, 770; in flat spacetime, 379; 
formulation by metric or vielbein, 785; of free 
particle, 162; gravitational time dilation, 284; for 
gravity, 339, 344, 346; as infinite series of terms, 
766; Kaluza-Klein, in Jordan frame, 686; length
or energy scale dependence of, 710; local, 246;
Lorentz, in Kaluza-Klein theory, 678; of matter
(see matter action); Maxwell, 325, 332, 675-676; 
for motion of finite sized objects, 714-715, 716; 
Newton-Einstein-Hilbert, quantum gravity limit 
of, 444; Newton’s law of action and reaction, 470; 
nondependence on metric, 723; nonlocality in 
time, 754; nonrelativistic, 241-242, 356; offshell 
information carried by, 782; reasons for emphasis 
on, 396; relativistic, 284-285, 308; relativistic 
string, 210n; scalar field, 332; specification of 
dynamical variables, 395; terms of, behavior at 
long distances, 722; topological, 720-721; total, 
Newtonian world, 145; of universe, 346, 356; as 
usually formulated, 783; of world, and energy 
momentum tensor, 378; Yang-Mills, 681 

action functional, 138 

action principle, 155; basics of physics, 136-149; 
different notions of, 138; as fundamental principle 
of theoretical physics, 783; globality of, compared 
to equation of motion, 141; kinetic term in, 140; 
and least time principle, 139, 144; metaphor for 
life, 140; mystery of, 141, 155; of particles and 
fields, 145; theories based on, 383; variational 
calculus, 113 

action variation, holding dynamical variables fixed, 
380 

active diffeomorphism, 397 

actual biological time, elapsed between event A and 
B, 179 

addition of velocities. See velocities 

ADM (Arnowitt-Deser-Misner) formulation, of 
gravitational dynamics, 693 

AdS. See anti de Sitter spacetime 

affine parameter, 308 

Aharonov-Bohm effect, 789n 

air resistance, and free fall, 268 

airline example, for proving curvature of earth, 66 

al-jabr, calculation method, 208n 

Al-Khwarizmi, calculation of square roots, 207n 

algebra: conformal, 614-623; de Sitter, and 
cosmological constant, 755; extensions of, 667; 
Lie (see Lie algebra); Lorentz (see Lorentz algebra); 
matrix, quick review of, 742-743; Poincaré (see 
Poincaré algebra) 

algorithm, etymology of the word, 207n 

ambitwistors: power of, 738; representation of, 736 

American football, relativity of, 171, 172f 

“analog Newtonian” equation, 367 

analytic continuation: de Sitter to anti de Sitter
spacetime, 664; hyperbolic coordinates, 661; of
stereographic projection, 641

analytic geometry, role of coordinates, 48 

Anderson, Phil, on particle physicists, 713n 

angles: defined by physicists, 170; hyperbolic, 628; 
importance of, 620 

angular coordinates: on de Sitter spacetime, 627; 
suppressed, 422, 426 

angular correlation, cosmic microwave background 
fluctuations, 523f 

angular deficits: as “measure” of curvature, 727; of 
polyhedra, 726-727 

angular momentum: around black holes, 412-413, 
459; conservation of, 30, 36-37, 48n, 126, 152, 
310; Kerr black hole, slow rotation limit, 571; loss, 
Penrose process, 471-472; of particle on sphere, 
148; of rotating black holes, 442, 465, 576; of 
rotating bodies, 563-577; symmetry of, 150 

angular velocity: around black holes, 414, 460; 
defined by time coordinate, 550; for Kerr black 
hole, 462f; slowly rotating gravitational sources, 
570; inside stationary limit surface, 471 

annihilated spacetime, 785 

annihilation operator, 447-448 

annus mirabilis, Albert Einstein’s, 265 

ant and honey analogy, 5-6, 5f 

ant movement, as example of variational calculus, 
128 

anthropic principle: and cosmological constant 
paradox, 751-752, 757; and ultimative theory, 
789n 

anti de Sitter / conformal field theories (AdS/CFT): 
AdS/CFT correspondence, 649, 787; conformal 
coordinates of, 654, 654f; and Poincaré half plane, 
68 

anti de Sitter spacetime (AdS), 606e, 612, 649-
666; for 2-brane model, 702; AdS2 boundaries,
664; boundary of, 655; d-dimensional, 650, 650f; 
different forms of, 660; in hyperbolic coordinates, 
661; isometry group of, 650; motion of light, 659; 
motion of massive particles, 659-660; Poincaré 
coordinates, 656; slice of, 658f; stereographic 
projection for, 661; table for, 662 

anti-gravity, discussion of, 392 

anticommutation: of differential forms, 597; Jordan’s 
manuscript on, 789n 

antimatter: and charge conjugation in Kaluza-
Klein theory, 678; creation of, 205, 206; in early
universe, 528; Feynman diagram of, 206f; in 
higher dimensional theories, 683; in quantum 
field theory, 476 

antiparticles, 26, 437-438 

antipodal condition, space of spheres, 646 

antisymmetric matrices, introduction of, 40 

antisymmetric symbol: in curved spacetime, 723- 
725; as invariant tensor, 60; role as metric, 734; 
used to contract indices, 719 


antisymmetric tensors: character of, 55; 
decomposition of, 236e 

antisymmetry: useful relations based on, 608. See 
also symmetry 

apparent singularities, at Schwarzschild radius, 409 

apparent violation of causality, in brane models, 703, 
705 

apple: falling, 36, 137f, 268; floor rushing up to meet,
270f

area: approximation by small rectangles, 546; 
infinitesimal, enclosed by closed curves, 547; 
Planck, and entropy of black holes, 442 

area and volume, concept of, and coordinate 
transformations, 75-76 

area theorem, Penrose process, 472 

area transformations, in differential forms, 598 

Aristotle, comparison to Newton, 140-141 

arithmetic, difference from mathematics, in terms of 
rotations, 56 

arithmetic laws, of working in general relativity, 665 

Arkani-Hamed, modified Einstein’s field equation, 
754 

Arnowitt-Deser-Misner (ADM) formulation, of 
gravitational dynamics, 693 

arrays, and vectors, 51n 

astronomy, with gravitational waves, 563 

astrophysical objects: mass and energy for 
gravitational waves, 569; Schwarzschild radius to 
actual radius relation, 366 

asymptotic safety, as approach to quantum gravity, 
760 

atomic clock, 287 

atomic physics, in early universe, 518 

atoms, action of, 714-715

attractor, stable, in cosmic diagram, 511f 

auxiliary fields, 217n 

auxiliary quantities, calculus, 129 

averaging, for many particles, 231 


baby string theory, 215; and Lorentz transformation, 
147 

Babylonian tablet, 214, 214f 

background radiation. See cosmic microwave 
background 

bad notation alert: confusion in time dilation, 198; 
confusion in relativistic action, 211; geodesic 
equation, 555 

balls: circularly arranged, falling toward spherical 
planet, 58-59; separation between falling, 554; in 
train, 160-161, 161f 

baryogenesis, 526-528 

baryonic matter, 502-503, 506 

basic vector: spacetime metrics, 181; (ur-), definition 
of, 43 

basis vectors: change by moving on surface, 99-100; 
for surface, in Euclidean space, 98; variation of, 
100
Beer, Gillian, on Lewis Carroll, 173n

Bekenstein-Hawking entropy, 441-442, 444; second 
law of black hole thermodynamics, 472 

Beltrami, Eugenio, and discovery of Poincaré’s half 
plane, 67n 

bending of light: “accelerated/dropped” gedanken 
experiments, 281-282. See also deflection of light 

Bentley, Richard, on existence of God, 520 

Bering Strait, “attractive force” of, 275 

Berlin Wall, construction of, 476 

Bernoulli, Jacob and Johann, brachistochrone 
problem, 120 

Bessel, Friedrich, Bessel functions, 376n 

Besso, Michele, letter of Einstein to son of, 177 

Bethe, Hans, and Peierls’ comments on thinking and 
calculating, 133 

Bianchi identity, 452; constraints on curvature 
tensor, 592; contracted, 393, 394; derivation of, 
392, 393; and Maxwell’s equations, 724; similarity 
to differential forms, 599 

Big Bang, 785; analyzed with cosmic potential, 508-
509; in cosmic diagram, 502-503; and cosmic 
microwave background, 517; in cosmic potential 
diagram, 508f; as creation of space, 498-499, 708; 
as point of infinite temperature, 496 

Big Crunch, 508-509, 508f, 514 

billiard balls, elastic collision of, 165e 

binary pulsar, emission of gravitational waves, 563 

binary systems, gravitational waves from, 714 

binding energy, gravitational, 455-456 

biological time, actual, elapsed between event A and 
B, 179 

Birkhoff, George: Newton-Jebsen-Birkhoff theorem, 
453; time dependent spherically symmetric mass 
distribution, 373 

black body radiation, of black holes, 436 

black hole hypothesis, historical, 13 

black holes: and 3-D metrics, 77; binary systems of, 
714; charged, 477-484; “dangers of extremes,” 
484; in de Sitter spacetime, 635; definition of, 
410; distance around extremal, 469; dust ball 
collapsing into, 422f; entropy of, 15, 436, 441, 
448, 766, 788n; estimation of “electric” and 
“magnetic” components for, 717; eternal, 421- 
422, 426-427, 479; extremal, 467-468, 478, 481; 
first and second law of thermodynamics, 472-473; 
formation of, 373, 421-423, 422f, 423f, 429-431; 
gravitational potential around, 410-411, 411f; and
Hawking radiation, 14-15; horizon of, 416-417, 
784; information paradox, 439; internal world 
of, 781; just sitting there, 482-483; Kerr black 
hole, 462, 464-468; Kruskal-Szekeres diagram, 
426; as limit for measuring device, 763-764; local 
gravitational field in great distance of, 574; mass 
determination, 570; mass of, given by Michell 
and Laplace, 366; mystery of, 410, 441; orbits for 
light moving around, 416f; orbits with substantial
angular momentum, 412-413; particles and light 
around, 409-418; perihelion shift around, 413; 
Reissner-Nordström black hole, 479, 483; rotating,
576; Schwarzschild black hole, 429f, 436; stellar 
collapse into, 455-456; strangeness of, 764-765; 
sub-/transextremal, 478, 483; tilting light cones, 
421; and unitarization of graviton scattering, 765. 
See also finite sized objects; rotating black holes 

blue shift, relativistic, of frequency, 186 

blue sky effect: reason for, 715; squared in gravity, 
717 

bodies, rotating: angular momentum of, 563-577; 
slowly, 570; spacetime deformation by, 460 

Bogoliubov transformation, 448 

boiling vacuum, 437-438 

Boltzmann constant, and temperature concept, 16n 

boosts: invariance of, 188; Lorentz transformation 
for, 169, 187; and rotations, commutation relations 
of, 191-192 

Born, Max, on Einstein’s gravity, 777 

Bose-Einstein condensate, 332 

bosons: bound to magnetic monopoles, 789; as open 
strings, 696 

Boulware, David, curved spacetime, 580 


bounce theory, 536n 

boundaries: of AdS2, 664; of anti de Sitter spacetime,
655; divergence of metric tensor, 663; numbers of, 
664 

boundary: of Euclidean anti de Sitter space, 662; 
incoming light beam, with Poincaré coordinates, 
659; spatial, in anti de Sitter spacetime, 649 

boundary conditions, of energy functional, 116 

bowl: curvature at bottom of, 85; potential energy of 
moving marble, 113-115 

box: accelerating, laser light, 281f, 283f; Lorentz 
contraction of, 23; particles in, 223f, 227; for 
studying physical systems, 649 

Boyer-Lindquist coordinates, 476; description of flat 
space, 78 

brachistochrone problem, 121f; formulated by 
Bernoulli, 120 

Brahe and Kepler, work of, importance for Newton, 
369n 

brane worlds, 696-707 

branes: 1-brane model, 702; 2-brane model, 700-702; 
initially static, 707; Poincaré invariant, 707; waves 
from bulk, 703f. See also membranes 

breathing circles, 679-680 

Bright, Ms. (limerick character), 294 

Broglie, Louis de, 773n; particle-wave dualism, 762 

Broglie wavelength, particles at Schwarzschild 
radius, 442 

Bronstein, Matvei, reconstruction of theory of gravity, 
764-765 

Buchdahl’s theorem, 454 

bulk waves, to brane, 703f 


Calabi-Yau manifolds, 695 

calculus: simplification of, auxiliary quantities, 129; 
of variation (see variational calculus) 

caloric, historical concept of, 786 

Calvino, Italo, Cosmicomics, 554 

candles: falling, 268, 271; standard, 359 

Carl Friedrich Gauss, and differential geometry, 
90-91 

Carroll, Lewis: constant notion of time, 173n; on 
times, 166 

Cartan, Elie, and Lie algebra, 586 

Cartan’s equations: anti de Sitter spacetime, 612; 
first, index transformations, 603; for maximally 
symmetric 3-spaces, 610; in spherically symmetric 
static spacetimes, 611; structural, 607, 684 

Cartan formalism: calculation of curvature, 602; 
curvature and covariant derivative, 605 

Carter-Penrose diagrams, 435. See also Penrose 
diagram 

Cartesian coordinates: change to polar coordinates, 
29, 62, 71; change to spherical coordinates, 63 

Casimir effect, 748-749, 758n 

Cauchy horizon, 404 

Cauchy problem, in Einstein gravity, 400 

Cauchy surface, initial data on, 402 

Cauchy’s theorem, for analytically continuing 
integrands into complex plane, 732 

causal structure, of de Sitter spacetime, 638, 639f 

causal structure of spacetime: domains, 530, 531f; 
Hawking radiation, 438; Penrose diagrams, 427, 
431 

causality, 178; apparent violation of, in brane models, 
703, 705; as fundamental principle of theoretical 
physics, 783; at Schwarzschild radius, 421; in 
special relativity, 204 

Cavendish, Henry, measurement of Newton’s 
constant, 32 

Cavendish experiment, and non-quantized gravity, 
771 

celestial mechanics, Newton’s solution of, 28-30 

censorship, cosmic, 479-480 

center-of-mass energy, graviton scattering, 761 

central forces, in celestial mechanics, 28 

central potential, and invariance, 47 

centrifugal force, 278; around black holes, 411; and 
curvature of curve, 97 

“centrifugal” potential, 126 

CFT (conformal field theories), 649n 

chain rule, transformation of Christoffel symbols in, 
132 

Chandrasekhar, Subrahmanyan, Kerr solutions, 481 

Chandrasekhar limit, 455 

Chang Heng, and concept of coordinates, 62n 

charge: conjugation, 678; conservation of, 
during antimatter creation, 205; coupling to 
electromagnetic field, 250; density of, in Maxwell’s 
equations, 252; Lorentz force law, 404; and
momentum in fifth dimension, 677; notion of,
246-247; quantization in Kaluza-Klein theory, 677

charged black holes, 477-484; Penrose diagram, 480f 

charged particles, individual, worldlines of, 715 

charged scalar fields, in 5-dimensional theories, 687 

Chern-Simons term: in (2+1)-dimensional spacetime, 
721; powers of derivatives, 722 

Chinese, and concept of coordinates, 62n 

Christoffel 1-form, definition of, 604 

Christoffel symbol, 129; brute force transformation 
of, 329; and comoving coordinates, 290; and 
covariant differentiation, 321; and curved 
spacetime, 278; definition range in parallel 
transport, 544; in Fermi normal coordinates, 
560; indices, number of, 131; introduction 
of, 99; schematic form of, 342; around 
spherically symmetric mass distribution, 310— 
311; transformation of, 132, 389; use of symmetry 
properties in Fermi normal coordinates, 561; 
variation of, 347 

circles: breathing, 679-680; of constant 
latitude/longitude, on sphere, 105; mistaken for 
points, 674f 

circular orbit: around black holes, 413-414, 413f; 
innermost stable, 414, 474; around massive object, 
549 

“classical” differential geometry, 96-109 

classical field theory, 119; harmonic oscillator in, 
361 

classical gravity, puzzling, 784 

classical mechanics, without Newton’s equation, 145 

classical physics, profound difference from quantum 
physics, 360-361 

classical relativity, not consistent with quantum field 
theory, 773n 

classicalization of gravity, 766 

clock paradox, 194n 

clocks: cosmic, 504; invented by Einstein, 166; 
observed in different frames, 196f; and rulers, 
role in physics, 719-720; slow running, in special 
relativity, 197 

closed curved space, 681 

closed forms, 604 

closed orbits, verification of, 30 

closed strings, 696 

closed timelike curves, 484 

closed universe, 296-297, 491; critical density, 497-
498; as de Sitter spacetime, 630; Einstein's field 
equations, 493-494; with positive cosmological 
constant, 633 

clothed singularities, 479 

Cohen, I. Bernard, visit to Einstein, 267 

coincidence problem, 499, 778 

collapse: dissipative, 520-521; stellar, 455-456 

collisions: elastic, of billiard balls, 165e; of particles, 
219-220, 438; of photons and electrons, 222f 

column vectors, notation of, 45 




common sense, to be abandoned for development of 
physics, 784 

“common to all the things contained in it,” 18n 

communication, in expanding universe, 293-294 

commutation: and group theory, 49; of matrices, 41 

commutation relations, between boosts and 
rotations, 191-192 

commutators: between A and B, definition, 49; 
computation of, cyclic substitution, 50; index-free 
representation of vector fields, 319; introduction 
of, 340; and Lie derivative, 328; of two covariant 
derivatives, 325, 341 

comoving coordinates, 290, 298; preferred flow 
direction in, 230 

comoving observers: and perfect fluids, 229; 
spacetime distance of, 174; in universe filled with 
perfect fluid, 492-493 

compact source approximation, 568 

compactification, of extra dimensions, 683 

completion: and promotion, of gravitational fields, 
218; relativistic, 242-243 

complex matrices, and twistors, 730 

complex parameters, rescaling of, 733 

complexification, of variables, 732 

Compton mass, of universe, 747-748 

Compton scattering, 222f; inverse, 235e

computational effort, by using action principle, 141 

condensed matter physics: dynamical critical 
exponent, 657n, 754, 758n; gauge potential of 
solid state structures, 721; scale and conformal 
invariances, 621 

conformal algebra, 614-623; flat spacetime, 615; 
generators of, 617; identification of, 618 

conformal coordinates: for anti de Sitter spacetime, 
654, 654f; for de Sitter spacetime, 638 

conformal equivalence, of anti de Sitter spacetime, 
654 

conformal field theories (CFT), 649n 

conformal flatness: of anti de Sitter spacetime, 662;
of de Sitter spacetime, 641-642

conformal generators, 615 

conformal groups, equality with isometry groups, 
656 

conformal Killing condition, 614 

conformal time, and cosmic time, 632 

conformal transformations, 614; generators of, 644; 
preservation of angles, 620; as solutions of Laplace 
equation, 616 

conformally equivalent spacetime, 311 

conformally flat metrics, 352e-353e; definition of, 94 

conformally flat space, 80-81e; as bad terminology, 
94 

conformally related spacetimes, 622e 

conjugation, charge, 678 

connection 1-form, 599-600; indices of, 607 

conservation: angular momentum (see angular 
momentum); charge, 205; covariant, of energy
momentum tensor, 384; current, 226, 253; energy 
(see energy conservation); energy momentum (see 
energy momentum conservation); momentum 
(see momentum conservation); and relativistic 
fluid dynamics, 233; and symmetry, 150-155 

conservation laws, 155; from action principle, 141; 
and Killing vectors, 589; for motion in curved 
spacetime, 310; in Newtonian mechanics, 35-37 

conserved quantities: in Newtonian mechanics, 30; 
Noether’s theorem, 152 

consistency condition, and determination of 
potential, 36 

constant latitude/longitude, circles of, on sphere, 105 

constant vector fields, covariant derivative of, 331 

constants, fundamental, 12 

constraints on metric, 403 

container: of anti de Sitter spacetime, 649; rectilinear, 
infinitesimal volume of, 80e 

continuity equation, for current conservation, 225 

continuum mechanics, notations of coordinates, 117 

contracted Bianchi identity, 393; derived from 
Einstein-Hilbert action, 394 

contraction: of indices, 46n, 345; of repeated indices, 
58n; of spacetime indices, metric for, 719; tensors, 
316 

contravariant indices, 72, 315 

contravariant vectors, 183 

coordinate differentials, 312 

coordinate invariance: general, 305-306, 672; local, 
in higher dimensional theories, 682 

coordinate patches, to cover entire space, 76 

coordinate scalars, to form a metric, 708-709 

coordinate singularities: compared to physical 
singularities, 91-92; and Einstein-Rosen bridge, 
92f; Kerr black holes, 467; Schwarzschild metric, 
365-366 

coordinate systems: failure of, 76-77; natural, 134 

coordinate transformations: 5-dimensional, gauge 
transformations as, 673; accelerated frames, 
285; change of metric under, 70-71; Christoffel 
symbols, by brute force, 329; in curved space 
and curved spacetime, 317; in differential forms, 
597; freedom of, 62; Galilean, of acceleration, 
276-277; general, 68-71, 312, 314, 318, 384; for 
gravitational waves, similarity to electromagnetic 
gauge transformations, 564; and indices (upper 
and lower), 73-74; and Jacobian, 75; and 
Mercator map, 79e; nonlinear, 69; as passive 
diffeomorphism, 398 

coordinates: angular, 422, 426, 627; Boyer-Lindquist, 
476; change of, 64-65, 641; choice of appropriate,
631; comoving, 290, 298; concept of, by Descartes, 
48; dimensionless, 665; Eddington-Finkelstein, 
431; effect of motion on, 160; geometric 
significance of, 68; hyperbolic, 661; hyperbolic 
radial, 653-654; internal, 675; Kruskal-Szekeres,
424-425, 432-433; Kruskal-Szekeres-like, 635;
light cone, 146-147, 170-171, 427-429, 704; 
locally flat, 130, 132, 278, 288, 552, 557; notation, 
62n, 117; Painlevé-Gullstrand, 417; poor choice 
of, 590; primed and unprimed, 18, 38, 39f, 71- 
73; pseudo-time, 657; relations between different, 
159; Rindler, 446; role exchange of, at horizon, 
419; of specific point, Fermi normal coordinates, 
559; static, 634, 636, 652; time, 652; traditional 
“names” of, 25; warped polar, 613e 

coordinates, “crazy,” 94e 

coordinatization, of de Sitter spacetime, 634 

Copernican principle, 491 

corotation/counterrotation: light rays, 461, 469; 
particles, 474 

correlation: angular, cosmic microwave background 
fluctuations, 474; of quantum fluctuations, 447 

coset manifolds, 590; and classification of space and 
spacetime, 666; group theory of universe, 644; 
and maximal symmetry, 625; and spontaneous 
symmetry breaking, 593 

cosmic censorship, 479-480 

cosmic clock, universe’s ambient temperature as, 
504 

cosmic coincidence problem, and cosmological 
constant paradox, 751 

“cosmic conspiracy,” 297 

cosmic diagram, 496, 502, 503f; flow in, 510-512; 
phase boundaries in, 513-514; stable attractor and 
fixed points, 511f 

cosmic expansion. See expanding universe 

cosmic microwave background, 236e, 517; angular 
correlation of fluctuations of, 523f; density 
fluctuations in early universe, 521-522; first 
acoustic peak, 523-525; fluctuations, effect of 
curvature on, 525-526; temperature, 515 

cosmic potential, 508f; Big Bang analyzed with, 
508-509 

cosmic ray particle, lifetime of, 198 

cosmic time, 295; and conformal time, 632; and 
horizon problem, 530 

Cosmicomics (Calvino), 554 

cosmological action: derivation of energy momentum 
conservation, 387e; variation of, 391 

cosmological constant: for 2-brane model, 701; added 
by Einstein, 360; in cosmic diagram, 502-503, 513; 
in de Sitter spacetime, 456; decaying, 756; and 
deceleration of cosmic expansion, 507; deletion 
of, Feynman diagrams for, 756-757; dependence 
of equation of state parameter on, 359; different 
spatial curvature, 634; Einstein’s field equation in 
presence of, 357; and Einstein-Hilbert curvature 
term, 754; and energy conditions, 557; essential 
role played by, 512; and expanding universe, 392; 
in inflationary cosmology, 534; introduction of, 
356, 393; as Lagrange multiplier for volume of 
spacetime, 756; mass scale of, 700; mystery of,
356, 711, 751, 782; positive, 633; in quantum
world, forbidding removal of, 361; and scale factor 
of universe, 496f; scaling of, 753-754; universe’s 
equation of state, 497 

cosmological constant paradox, 745-759; algebraic 
solution of, 754-755; analogies of, 786; anthropic 
principle, 751-752, 757; breaking free of local field 
theory, 756; as challenge for physics, 712; cosmic 
coincidence problem, 751; deeper understanding 
of physics, 753; deletion of Feynman diagrams, 
756-757; effective action for gravity, 711; and 
effective field theory, 782; extreme ultra infrared 
regime, 750-751; inflation, 751; largest and 
smallest masses, 747-748; linkage between 
infrared and ultraviolet regime, 752; naturalness 
doctrine, 749-750; omniscience of gravity, 745; 
Planck mass, 746-747; potential solution of, 
778; quantum fluctuations, 745-746; question 
for explanation of vacuum energy, 752-753; 
unimodular gravity, 755-756; vacuum energy 
density, 749 

cosmological distances, 750; scaling at, 753-754 

cosmological equation, 501; appropriate units, 633; 
and history of universe, 503-504 

cosmological principle, 289, 491-492; Einstein’s field 
equations, 494; Newtonian mechanical analogies 
from, 507, 513 

cosmological redshift, 295 

cosmological time, in outgoing brane wave model, 
706 

cosmology: curvature of universe, 490; gases for, 
230; golden age of, 491; inflationary, 530-536; 
nonlocal, 712; observational, 505; physical history 
of early universe, 515-529; proper distances, 296- 
297; trans-Planckian, 518; use of scalar fields in, 
759n 

couch potato problem: action principle, 143; particles 
at rest, 142 

Coulomb potential, dilation invariance, 620 

counting: for characterizing intrinsic curvature, 110; 
and group theory, 56-57; of matrix elements, 
87-90 

coupled Einstein and Maxwell equations, static 
solutions, 482-483 

coupled ordinary differential equations, relativistic 
stellar interiors, 452 

coupled partial differential equations, transfer of 
spacetimes, 664 

covariance: difference from invariance, 47; general, 
285 

covariant curl, 325; derivation of energy momentum 
tensor, 381 

covariant curvature, 339 

covariant derivatives: along geodesic, 553; in Cartan 
formalism, 605; concept of, and differential 
geometry, 100-101; constructed by parallel 
transport of vectors, 103; Newton-Leibniz rule,
failure for, 342; and objects carrying vectorial
arrow, 109; in parallel transport, 543-544; of 
vectors, 340 

covariant differentiation, 320-333; and Christoffel 
symbol, 321; along curves, 327 

covariant divergence, 326; of tensors, 332 

covariant indices, 72, 315 

covariant vectors, 183 

CP (charge conjugation followed by parity) violation, 
528; in higher dimensional theories, 683 

“crazy” coordinates, 94 

creation of space, 498 

creation operator, 447-448 

critical density, 497-498; and Hubble length, 514; 
ratio of energy density to, 505 

critical phenomena, theory of, 621 

Crommelin, Andrew, and Royal Society expedition, 
367 

cross-product notation, angular momentum 
conservation, 48n 

cross section scattering, 715 

cube, topology of, 725 

cube of physics, 12-13 

cubic vertex, 739 

curl: covariant, 325; exterior derivative, 599; 
relativistic, 252 

curled up space, 673-674 

current: conservation of, 225, 253; in relativistic 
physics, 223; in string theory, 235 

curvature, 667; 5-dimensional scalar, 684-685; and 
acceleration, 554; angular deficits as “measure” 
of, 727; calculated on basis of given metric, 
66; calculated with Cartan formalism, 602, 
605; calculated with differential forms, 607; 
connection with field strength by differential 
forms, 602n; constant of scalars, 589; of curve, 
89, 97; from curves to surfaces, 106; of cylinders 
and spheres, 6; of earth, airline example for 
proving, 66; expressing failure of Newton-Leibniz 
rule for covariant derivatives, 342; extrinsic (see
extrinsic curvature); of “fixed latitude” circle, 
80e; intrinsic (see intrinsic curvature); invariant 
or covariant description of, 339; and least path 
principle, 5-6; measurement of, 89, 547, 548e; 
negative, definition of, 85; Riemann (see Riemann 
curvature); scalar (see scalar curvature); of space, 
65-66; of spacetime (see curved spacetime); spatial, 
effect on CMB fluctuations, 525-526; of surface 
and curves, 104-105; of universe, 490-491, 495, 
748; vanishing, 85 

curvature 2-form, 600 

curvature density, 504, 512; and flatness problem, 
531 

curvature tensor: alternative derivation of, 547-548; 
anti de Sitter spacetime, 651; computation of, with 
symbolic manipulation software, 607; constraints 
on, 591; for de Sitter spacetime, 626; directly
from 2-form, 611; Fermi normal coordinates, 560; 
fixed by maximal symmetry, 592; and Hawking 
radiation, 438; and Kerr metric, 476; in locally flat 
coordinates, 553; in maximally symmetric spaces, 
589; use of symmetry properties, 561. See also 
Riemann curvature tensor 

curved rectangle, displacement of vector, 341f 

curved space: and change of coordinates, 64-65; 
closed, 681; compared to curved spacetime, 
91; and coordinate transformations, 68, 317; 
determination of curvature, 65-66; embedded in 
higher dimensional flat spaces, 85-86; sphere as 
example for, 83 

curved spacetime: antisymmetric symbol in, 
723-725; and change of coordinates, 64-65; 
compared to curved space, 91; coordinate 
transformations in, 317; determination of 
Lagrangian in, based on Einstein’s equivalence 
principle, 712; electromagnetism in, 325-326; 
energy momentum tensor in, 380; Euclid’s 
axiom, 552; general, spatial distance in, 290- 
292; geodesic equation for, 277-278; governed 
by energy distribution, 390; governed by 
gravity, 344-346; and gravity, mystery of, 276; 
independence of mass, 258-259; in lab, 332; 
Maxwell’s equations, 333; most appreciated, 
624; motion in, 289-290, 301-311, 307-309; 
Newtonian limit, 302-303; quantum field theory 
in, 780; Raychaudhuri equation, 449; spacelike 3- 
dimensional hypersurface, 693f; around spherical 
mass distribution, metric for, 364; spinors in, 
604-605; universality of gravity, 275-276; universe 
as, 288-300; visualizations, 296 

curved surface: parallel transport of vectors on, 102; 
and tangent plane, 83f 

curves: of constant latitude, 92, 93; curvature of, 
compared to surface, 89n; decomposition of, 545; 
in Euclidean space, 96-97; fear of, 82; in geodesic 
problem, definition of, 123; length in spherical 
coordinates, 127; minimal length of, 128; in 
Minkowskian spacetime, 175; parametrized, 
and parallel transport, 543; reparametrization 
invariance of, 124; on surface, determination of 
curvature, 104-105 

cutoff: concept of, in quantum field theory, 758n; 
instead of infinities in physics, 770 

cyclic substitution, computation of commutators, 50 

cyclic symmetry, of Riemann curvature tensor, 351e 

cylinder: curvature of, 6, 84-85, 107; tangent plane 
of, 98; topological, 654 


D-branes, Bekenstein-Hawking entropy, 444 

d-dimensional Euclidean space, rotations in, 49-51 

d-dimensional sphere, definition of, 624 

d-dimensional anti de Sitter spacetime, definition of, 
650 


Damour, Thibault, on Einstein’s application of 
Lorentz transformation to physics, 190 

“dangers of extremes,” 484 

Dao, of many-worlds interpretation of quantum, 
788n 

dark energy, 359, 627n, 642; coincidence problem 
in dark energy-dominated universe, 499; and
cosmological constant paradox, 711, 747, 778; in 
de Sitter spacetime, 456; and energy conditions, 
557; and Hubble parameter, 391; mystery of, 
356; and Nobel Prize in Physics (2011), 361n; 
observational evidence for, 503; phase boundaries 
in cosmic diagram, 514; and scalar fields, 788n; 
struggle with dark matter, 502-503 

dark energy density: negative pressure as 
consequence of, 360; in spacetime, 356, 359 

dark matter: gravitational lensing, 370-371; 
observational evidence, 503, 503f, 506; structure 
formation in early universe, 522-523; struggle 
with dark energy, 502-503 

de Broglie, Louis, 773n; particle-wave dualism, 762 

de Donder gauge condition, gravitational waves, 564 

de Sitter algebra, and cosmological constant, 755 

de Sitter horizon, 293, 636f; thermal radiation from, 
637 

de Sitter-Lanczos-Weyl-Lemaitre spacetime, 642 

de Sitter length, inverse of, Hubble constant, 632 

de Sitter metric, history of, 642 

de Sitter precession, 549 

de Sitter spacetime, 456, 624-648, 625f; angular 
coordinates, 627; to anti de Sitter spacetimes, 664; 
causal structure of, 638, 639f; containing black 
holes, 635; d-dimensional, 625f; different forms of, 
629; isometry group of, 625; iterative relationships 
of, 640; Kruskal-Szekeres-like coordinates for, 
636f; Lemaitre-de Sitter metric, generalized, 489;
maximal symmetry of, 626; preview of calculation 
of, 148; Riemann curvature tensor of, 626; and 
space of spheres, 646; stereographic projection 
for, 641; table for, 643 

decomposition, of groups, 56f; definition of, 56-57 

decoupling: of internal and external geometries, 
691-692; of matter and radiation in early universe, 
516-517 

defining representation, of rotation group, 54 

deflection of light, 368f; by astrophysical objects, 
Soldner’s calculation of, 366-367 

degree, subdivision of, proposal by Ptolemy, 368n 

degrees of freedom, gravitational waves, 564 

degrees of polarizations, gravitational waves, 564 

delay, and radar echo experiments, 372 

delta function. See Dirac delta function

Denken, before Integration, 133

density fluctuations: in early universe, 521, 523-525; 
in inflationary cosmology, 533 

density waves, in static relativistic fluid, 234 

derivatives: covariant (see covariant derivatives;
covariant differentiation); exterior, differential
forms, 598; as fractions, 207; of functional, 
definition of, 116-117; Lie, 327-328, 331-332, 
555; order of taking, 340; taken with respect to 
functions, 113; two, in Newton’s force law, 110 

Descartes, René: approach to questions in physics, 
583n; concept of analytical geometry, 18; and 
concept of coordinates, 62n; versus Euclid, 48; and 
Euler characteristics, 726; theory of vortices, 578n; 
watching a fly, and concept of coordinates, 51n 

Deser, Stanley: ADM formulation, 693; curved 
spacetime, 580 

determinants: antisymmetric, 236; definition of, 60, 
719; and intrinsic curvature, 84; introduction of, 
39-40; of metric, 215-216; variation of, 381 

DeWitt, Cecile, on “vielbein,” 606n 

Dialogue Concerning the Two Chief World Systems 
(Galileo), 17-18 

diffeomorphism, 398 

differences, infinitesimal, and Galileo 
transformations, 160 

differential equations: coupled ordinary, relativistic 
stellar interiors, 452; solving problems of motion, 
26-27; in variational calculus, 126 

differential forms: applications of, 607-613; 
calculation of 5-dimensional scalar curvature, 
684-685; Cartan’s structural equations, 607; 
Hodge star operation on, 723-725; jargon of, 604; 
in Kasner universe, 613e; language of, 596; and 
magnetic flux, 728; and vielbein, 594-606. See also 
topological entries 

differential geometry: classical, 96-109, 130; and 
concept of covariant derivatives, 100-101; logic 
of, 66; pioneering work of Gauss and Riemann, 
90-91; of Riemannian manifolds, Cartan’s 
formulation of, 599-600 

differential operation, definition of, 598 

differential operators, 48; Killing vectors as, 587; 
shorthand notation for, 72; vector fields as, 319 

differentials: coordinate, 312; manipulations of, 
161 

differentiation: along curves, 327; covariant, 320-
333; dot notation, 96; of functionals, 114; of scalars
and vectors, 318 

dilation: conformal generators, 615; generators of, 
644 

dilation invariance, Coulomb potential, 620 

dilaton field, 680; and calculation of 5-dimensional 
scalar curvature, 686; in outgoing brane wave 
model, 704 

dimensional analysis, 120; of action, 346; to 
determine power of scattering amplitude, 717; 
for effective action of gravity, 711; of graviton 
scattering amplitude, 770; Hawking temperature, 
14-15; scattering amplitude, 761 

dimensions: higher, Poincaré half plane in, 656; 
invisible, 672-673; large extra, 696-707

Dirac, P.A.M., and quantization of magnetic flux,
728 

Dirac action, in Minkowskian spacetime, 605 

Dirac delta function: 3-dimensional generalization, 
119; continuous variables in functional variations, 
122; in electromagnetism, 251; in higher 
dimensions, 698, 701; introduction of, 26-27; 
and Kronecker delta, 36; as limit of a sequence of 
functions, 27f; momentum conservation, 740; and 
smooth functions, 33e; time, 229 

Dirac equation, commutation relations, 192 

Dirac-Feynman formulation. See path integral 
(Dirac-Feynman) formulation 

Dirac large number hypothesis, 778 

Dirac spinors, 604-605 

directional derivative, covariant differentiation and, 331 

discretization, of functional variation, 121 

disks. See accretion disks 

dissipative collapse, 520-521 

distance: of cities, and non-flatness of world, 66f; in 
Euclidean space, 174; in generally curved spaces, 
181; Hubble units, 293; less important than 
angles, 620; luminosity, 298; measurement of, in 
spacetime, 180; minimal between points, 123- 
135; in Minkowskian geometry, 175; operational 
definition of, 291, 291f; of points, 174-175; proper, 
296-297; shortest, 175, 176f, 545; spatial, in 
general curved spacetime, 290-292; of spheres 
in 3-spaces, 610; traversed, during lifetime of 
particles, 198. See also length; path 

distribution, compared to functions, 33 

divergence: covariant, 326, 332; notation in 
various coordinate systems, 78-79; in spherical 
coordinates, 81e; transformation, 321 

divergence theorem, generalized to 4-dimensional 
spacetime, 386 

dome, curvature at top of, 85 

dominant energy condition, 557 

Donder, Théophile Ernest de, application of action 
principle to gravity, 397 

Donoghue, J. F., treating general relativity as effective 
field theory, 773n 

Doppler effect: in accelerated frames, 282; relativistic, 
185-186, 222 

dot notation, Newton’s, 29, 96 

dot product: of four vectors, 182; of vectors, definition 
of, 39 

dots, as symmetry symbol, 129 

“dropped” thought experiment, 280-283, 286 

Droste effect (for pictures), 375 

Droste’s solution, of Einstein’s field equation, 375 

dS/CFT (de Sitter / conformal field theories) 
correspondence, 787 

duality, electromagnetic, 255, 483 

dueling thinkers experiment, 7-9 

dust: cosmological, 387e, 495, 514; technical term, 
421

dust ball, collapsing, forming black hole, 422f, 423f

dynamic universes, 489-501 

dynamical critical exponent, in condensed matter 
physics, 657n, 754, 758n 

dynamical variables, 249; in continuum mechanics, 
117; holding fixed in variation of action, 380; 
independence of, 133; specification of, 395 

Dyson, Freeman: on Einstein’s ideas of a field theory 
of gravity, 119; on Einstein’s saying about vanity, 
777; on loneliness of Einstein, 388; on non- 
quantization of gravity, 768-769; on quantization 
of gravity, 780, 788n 


early universe, 496-497; curvature term, 495; density 
fluctuations in, 521; history of, 515-529; structure 
formation in, 520, 522-523 

earth: center of, and falling apple, 36; density of, 32; 
surface of, shortest path on, 275; theory of hollow 
sphere, 32 

earth-moon distance, accurate measurements of as 
test of Einstein gravity, 366n 

Eddington, Sir Arthur: and Chandrasekhar limit, 455; 
on costs of light, 369; and geometry of universe, 6; 
making Einstein a worldwide celebrity, 369-370; 
Royal Society expedition, 367 

Eddington-Finkelstein coordinates, 431 

edges, in topology, 725-727 

“effect of inertia,” 276 

effective action: for gravity, 711; Weyl’s ansatz, 374 

effective field theory, 782; and concept of action, 
710; Einstein-Hilbert action, 709; general 
relativity treated as, 773n; and graviton scattering 
amplitude, 770; of gravity, 766; low energy, 
711-712 

Ehrenfest, Paul, letters from Einstein: on fears 
of going insane, 355; on perihelion motion of 
Mercury, 368 

eigentime, 179f 

eigenvalues, of matrix, usual determination of, 106 

eigenzeit, 179f 

Einstein, Albert, 150; and action for gravity, 339; 
anger at nostrification of his theory of general 
relativity, 396; annus mirabilis, explanation of 
light, 213; as classical physicist, 360; confusion 
concerning the metric, 404; E = mc², 209, 220-
221, 232, 233f; early work, lack of vector notation,
46n; equivalence principle, 271; ether detection, 
experimental set-up for, 163; factor-of-2 error, 367, 
370; field equation, 348f; gedanken experiments, 
on simultaneity, 7-9; on going beyond space and 
time, 787; greatest blunder, 393; and Grossmann, 
paper on variational principle for gravity, 396; 
happiest thought of life, 265, 278, 302; “hole 
argument,” 404; on influence of philosophers, 
159; invention of Palatini formalism, 397; legacy 
to physics, 253-255; letter from Schwarzschild, 
362; letters to Ehrenfest, 355, 368; letters to
Kaluza, 693-694; letters to Sommerfeld, 344, 366,
580; longing, 337; and Lorentz, 168; on magic of 
relativity theory, 195; mathematical elegance of 
his theory, 777; on mysteries, 778; old man’s toy, 
267; penance, 500; on pure thought, 172; repeated 
index summation (see summation convention); 
“second greatest blunder,” 509-510; separation 
from his wife, 399; and Soldner’s calculation, 
366-367; stars made of nothing, 456; static 
universe, 509-510, 514; summation convention 
(see summation convention); understanding of 
gravity, equality of inertial and gravitational mass, 
28; unfinished symphony, ripples in spacetime, 
563 

Einstein’s clock, 166-173; in different frames, 
167f

Einstein convention, in general relativity, 314 

Einstein’s equivalence principle, determination of 
Lagrangian in curved spacetime, 712 

Einstein’s field equation: 3-factor, and metric 
tensor formalism, 76; 5-dimensional, for 2- 
brane model, 700; acceleration or deceleration 
of cosmic expansion, 506-507; anti de Sitter 
spacetime, 651; for charged black holes, 477; for 
closed/open/flat universes, 493-494; coupled 
to Maxwell’s equations, static solutions, 482-
483; de Sitter spacetime, 627; derived by Palatini
formalism, 395; determination of, 347-349; 
Droste’s solution of, 375; easy solutions to, 
557; Einstein’s search for, 341-342; for empty 
spacetime, 347-348; flipping between spacetimes, 
664-665; Kerr solutions on, 464; in Minkowski 
metrics, 563; modified by Arkani-Hamed, 754; 
non-determinism of, 403; nonlinearity of, 400; in 
post-Newtonian approximation, 577; in presence 
of cosmological constant, 357; for relativistic 
stellar interiors, 451; result of derivation of, 390; 
role of two powers of spacetime derivative, 402; 
solving, 358; and spacetime thermodynamics, 
448-449; time-time component of, 498; traceless 
part of, 755; vacuum, 647e; variation of, 350 

Einstein gravity: from ambitwistor representation, 
739; connection to Yang-Mills theory, 782; cube of 
physics, 13f; discord with quantum physics, 768-
769; features of, 777; replaced by something more
fundamental, 785; roads leading to, 578-584 

Einstein-Hilbert action: alternative form of, 
397; cosmological action added to, 356; and 
cosmological constant, 712, 754; derivation of 
contracted Bianchi identity, 394; and differential 
forms, 725; and effective field theory, 782; effective 
field theory approach, 709; finding of, 344-346; 
general invariance of, compared to Maxwell action, 
394; graviton coupling, 582; higher dimensional, 
681; in Kaluza-Klein theory, 675; low dimensional
terms, 782; quantum gravity limit, 444; things 
unknown to, 789n; and twistors, 739; variation of,
388; weak field action without, 572; and Weyl’s
ansatz to Schwarzschild solution, 374 

Einstein-Hilbert Lagrangian, determination of 
Einstein’s gravity, 581 

Einstein potential, compared to Newtonian potential, 
planetary orbits, 371 

Einstein-Rosen bridge, 433; and coordinate 
singularities, 91-92, 92f 

Einstein tensor, 388; as result of variation of 
Einstein-Hilbert action with respect to metric, 
394 

Einsteinian mechanics, and cube of physics, 13f 

elastic string: as example of variational calculus, 113; 
hanging under force of gravity, 114f 

electric charge: role for photon, 383. See also charge 

electric dipole moment, of atom, and action, 715 

electric field, 245; relativistic unification, 247 

electrodynamics: initial value problem in, 404. See
also Maxwellian electromagnetism

electromagnetic action, 244, 250-251; in expanding
universe, 333; local, 246. See also Maxwell action

electromagnetic coupling constant, 767

electromagnetic current: conservation of, 225; as flat
space analogous to energy momentum tensor,
379; relativistic form of, 226

electromagnetic duality, 255, 483

electromagnetic field: as collection of infinite
number of harmonic oscillators, 382; coupling to
charged particles, 250; derivation of equation of
motion, 385; determination of, 338, 342n; energy
density of, 255; energy momentum tensor in, 381;
in Kaluza-Klein theory, 691; and Lorentz vector
potential, 244; Maxwell action in Minkowskian
spacetime, 381; and Maxwell’s equations, 252;
motion in curved spacetime, 301; and mystery
of light, 162; at particle position, 246; treated as
superposition of modes, 746

electromagnetic field tensor, 244; dual, 255; gauge
invariance, 249; relativistic curl of a 4-vector,
252

electromagnetic gauge transformations, similarity to
coordinate transformations, 564

electromagnetic interaction, compared to
gravitational interaction, 768

electromagnetic potential, in fifth dimension, 677

electromagnetic waves: cross section for scattering on
atom or molecule, 715; momentum of, derivation
of Einstein’s formula, 232

electromagnetism: 4-dimensional, 720-721; in 
curved spacetime, 325-326; described by 
differential forms, 598; finite sized objects in, 
714-715; fixed gauges, 564; in flat spacetime, 382; 
gauge invariant derivative in, 342, 353n; Maxwell’s 
laws of, and Galilean transformation, 20; 
restrictions imposed on by Lorentz symmetry, 339; 
role of signs in, 382; similarities to gravitational 
waves, 568; from special relativity, 244-246; theory
of, development of, 253; unification with gravity,
674-676

electromagnetism analogy, Einstein’s search for the
metric, 404

electrons: collisions with photons, 222f; degenerate,
455; delayed recombination in early universe,
516-517

electrostatics, mathematical treatment leading to
Maxwell, 582

electroweak interaction, 527, 765; in M versus R plot,
14f

elementary particles, masses of, 16n

elementary physics, definition of mass, 213

elementary scalar fields, 759n 

embedding: of curved spaces in higher dimensional 
flat spaces, 85-86; of surface, determination of 
curvature, 90 

embedding space, geodesics in, 645 

Emerson, Ralph Waldo, dictum of, 235n 

empty spacetime: Einstein’s field equation for, 
347-348; gravity in, 362 

energy: dark (see dark energy); elastic, of hanging 
string, 113-114; exact meaning of, 383; extraction 
from Kerr black holes, 470; of membrane, 
rotational invariance, 118; not conserved, 27; 
search for minimization function, 114; spatial 
density of, 228. See also gravitational energy 

energy conditions, 557 

energy conservation, 26, 153; historical 
considerations, 387n; around rotating black 
holes, 459; in static isotropic spacetime, 310 

energy density: constant, filling universe, 356; 
electromagnetic field, 255; in flat spacetime, 
382; ratio to critical density, 505; replacing mass 
density, in Newtonian gravitational potential, 
379n; as rotational scalar, 226-227; and scale 
factor of universe, 496f; of universe, 359, 504; 
vacuum, 749. See also dark energy density 

energy distribution, governing curvature of 
spacetime, 390 

energy functional: boundary conditions, 116; of a 
membrane, 118; minimization for Newtonian 
gravity, 119 

energy level splitting, inverse of, at cosmological 
time scale, 768 

energy momentum, role for graviton, 383 

energy momentum conservation, 227; and Bianchi 
identity, 393; derivation by using cosmological 
action, 387e; and general invariance of matter 
action, 383-384; in gravitational field, 386 

energy momentum pseudotensor, 386 

energy momentum tensor: assumptions about,
557; called stress energy tensor, 228; covariant
conservation of, 384; curved spacetime
generalization of, 380; in electromagnetic field,
381; of electromagnetic field, tracelessness, 381;
under Lorentz transformation, 226-227; “new
and improved,” 712; of perfect fluids, 230, 492;
for scalar action, 387e; sign considerations, 
380; slowly rotating bodies, 570; as source for 
gravitational field, 379; total, disappearance of, 
394; and variation of Maxwell action with respect 
to metric, 394. See also stress energy tensor 

energy per unit mass, as conserved quantity, 30 

energy scale: grand unified theory, 767; introduced 
by gravity, 770 

energy variations, calculated, 115 

entanglement: and mysteries of quantum mechanics, 
789n; and quantum gravity, 771 

entropy: Bekenstein-Hawking, 441-442, 444; of 
black holes, 15, 436, 441, 448, 788n; lack of 
knowledge behind horizon, 648n; Penrose process 
area theorem, 472; per particle, relativistic fluid 
dynamics, 234; in spacetime, mystery of, 234; of 
universe, 527 

Eot-Wash group, 260 

equation of motion, 155; by action principle, 146; 
near boundary, 665; electromagnetism, 245-246, 
385; energy conservation of, 153; generalized 
for particles under external force, 190; particles 
in potential of, 135; for universe, 357; from 
variational principle, 137 

equation of motion approach, to Einstein gravity, 396 

equation of state: of ideal gas, 231; of universe, 359, 
494, 496-497 

equations: E = mc? (see Einstein); as expression of 
physics, 47; versus identities, 403. See also specific 
equations 

equator, length of, squashed sphere, 80e 

equilibrium, hydrostatic, relativistic stellar interiors, 
453-454 

equilibrium macrostates, Bekenstein-Hawking
entropy, 441, 444 

equipartition theorem, Planck and, 789n 

equivalence principle, 271; and definition of 
energy momentum, 386; falling living room as 
example, 265-266; and general covariance, 286; 
motion in curved spacetime described by, 302; 
nonimpossibility of deleting Feynman diagrams,
756-757; old man’s toy, 267; predictions of, 
280; and relativistic stellar interiors, 451; and 
symmetry, 317-318 

ergoregion, 467, 469-471 

escape: from black hole, 427, 483; from gravity, 
nonimpossibility of, 717n 

escape problem: in Kaluza-Klein theory, 673-674;
with large extra dimensions, 696-697 

eternal black holes, 421-422; Kruskal-Szekeres 
diagram, 426-427; Reissner-Nordström, 479

ether: detection experiments, 163; as dynamical 
variable, 783 

Euclid: and curves, 189; versus Descartes, 48; famous 
axiom, and curvature of spacetime, 552; on the
non-existence of royal road to geometry, 42;
shortest path between two points, 4 

Euclidean anti de Sitter space, boundary of, 662 

Euclidean ball, 663; boundary of, 664 

Euclidean geometry: flat, 6; rotation invariance of,
190; specification of, 175

Euclidean group, as symmetry group of physics, 755

Euclidean metric: in hyperbolic spaces, 93; inducing
curved space metric, 86; locally flat, second order
corrections to, 88; for spaces of any dimension, 87

Euclidean plane, conformal Killing vector fields,
623e

Euclidean space: 2-dimensional, definition of, 41; 
curves in, 96-97; d-dimensional, 42, 49-51; 
described with different coordinates, 62-63; 
distance in, 174; as example for Killing vector 
fields, 587; object analogs in twistor space, 742; 
path lengths in, 190; surfaces in, 98-109

Euclidean thinking, trap of, 180 

Euler, Leonhard, variational calculus, 120 

Euler characteristic, 725-726 

Euler equation: in fluid dynamics, 164; relativistic 
and fluid dynamics, 234 

Euler-Lagrange action, for material particles, 207 

Euler-Lagrange equation, 116; action principle, 138; 
fields, 119; multiple unknown functions, 123; 
simplification of, Poincaré half-plane, 133 

events: coordinates of, in discussion of simultaneity, 
200; definition of in spacetime, 177; horizon 
of, 293, 536; of pole in the barn problem, 203; 
separation of, 160, 166; spacetime locations, 195; 
and worldlines, in special relativity, 195 

expanding universe: acceleration or deceleration of 
expansion, 499-500, 506-507; closed/open/flat, 
494, 497-498; communication in, 293-294; 
curved, 489; de Sitter spacetime, 456-457, 627, 
630; with differential forms, 608; distances 
in, 292-293; earth-moon distance not growing 
because of, 289; electromagnetic action in, 333; 
expansion rate, discovery of, 359; exponentially 
expanding, 293-294, 357-358, 631, 642-643; and 
Hubble, 500; light cones in, 294, 294f; metric 
tensor of, off-diagonal components, 292; and 
positive cosmological constant, 392; without 
Einstein’s field equation, 645. See also universe 

expansion parameter, determination of, 556 

exponential, of matrix, 41 

exponential function, and rotations, 41 

extended objects, motion of, 714 

extensive quantities, 441 

exterior derivative, 599; differential forms, 598 

external forces, influencing motion in curved 
spacetime, 301-302 

external potential, translation invariance of, 242 

extra dimensions, large, 696-707 

extraction of energy, from Kerr black holes, 470 

extremal black holes, 467-468; charged, 478, 481;
“dangers of extremes,” 484; distance around, 469;
first and second law of thermodynamics, 473; just 
sitting there, 482-483 

extreme ultra infrared regime, 786; and cosmological 
constant paradox, 750-751 

extremizing a function, with constraint, 109 

extremum, determination of type, 117 

extrinsic curvature, 5-6; defined by Gauss, 107; 
and matrix eigenvalues, 84-85. See also intrinsic 
curvature 


faces, in topology, 725-727 

fall: through event horizon, 412; into rotating black 
holes, 463-464, 470, 472 

falling apple: from hanging string to, 137f; and Isaac 
Newton, 268 

falling living room analogy, 265-266 

families, of quarks and leptons, 786 

family problem, mystery of, 7 

far field, of gravitating system, 576 

Faraday, Michael: conception of scientists, 9n; flux 
picture, 697; and magnetic flux lines, 728 

fate of universe, 507-509 

Fermat, Pierre, controversy over birth year, 136n 

Fermat’s least time principle, 4; as analog to 
Einstein-Hilbert action, 789n; teleological flavor 
of, 136 

Fermi, Enrico, theory of weak interaction, 765 

Fermi normal coordinates: locally flat, 558f; metric
in, 559, 561; motivation of, 557 

Fermi pressure, and Chandrasekhar limit, 455 

Fermi-Walker transport, 193e 

fermions: fundamental, 683; as mystery of physics, 
781-782; as open strings, 696 

Feynman, Richard P., 145; curved spacetime, 580; 
“shut up and calculate,” 445 

Feynman diagrams: of antimatter, 206f; for 
cosmological constant, deletion of, 756-757; 
for gluon scattering, 735-736, 738; for graviton 
scattering, 738; and worldlines, 237n 

Feynman’s path integral formalism. See path integral 
(Dirac-Feynman) formulation 

Feynman’s path to rescue a drowning girl, 3-4, 4f 

Feynman propagator, for graviton, 573 

fictitious forces, 278-279 

field equations: in Minkowski metrics, 563;
Nordström’s theory, 579. See also Einstein’s field
equation

field strength: connection with curvature, 602n;
relation to electric and magnetic fields, 382

field theory: classical, 119, 361; quantum (see
quantum field theory); topological, 719-728

fields: conceptual jump from many particle case,
400; to describe universe, 384; notion of, 119; and
particles, 145-146; understanding of, 783

fine structure constant, 767 

“finger of God” problem, 703, 705

finite sized objects: in electromagnetism, 714-
715; in gravitational field, 716-715; scattering
amplitude for gravitational wave, 717; sensitivity 
to variations, 716; and tidal forces, 716-717. See 
also black holes 

Finkelstein, David, Eddington-Finkelstein 
coordinates, 431 

first acoustic peak, 523-525; effect of curvature on, 
525f 

first law of thermodynamics: black holes, 472-473; 
and pressure of universe, 360n 

first order formalism for gravity, Palatini formalism, 
395 

first stars, 519 

“fixed latitude” circle, curvature of, 80e 

fixed points, in cosmic diagram, 511f 

flame, of falling candle, 268, 271 

flat coordinates, locally, 130, 132; as trick in variation, 
389 

flat plane, curvature of, 105 

flat space, 65; conformally, 80-81e; description by 
Boyer-Lindquist coordinates, 78; and everyday life, 
82-83; metric, 77 

flat spacetime: with conformal algebra, 615; 
electromagnetism in, 382; Minkowskian, 
governing action of, 581; twistors in, 729-745 

flat universes, 296-297, 491; age of, 513; critical 
density of, 497-498; curvature effect on CMB 
fluctuations, 526; Einstein’s field equations, 493-
494; observational evidence for, 505; stability of,
512 

flat world, 88 

Flatland (Abbott), 671 

flatness, local. See local flatness 

flatness problem, 531 

floor, rushing up to meet apple, 270f 

flow: in cosmic diagram, 510-512; described by 
geodesics, 556; going with the, 328 

fluctuations: of density in early universe, 521, 523- 
525; in inflationary cosmology, 533; quantum, 
436, 446-447, 533 

fluid dynamics: Euler equation, 164; Galilean 
invariance of, 164; symmetry approach to, 164 

fluids: incompressible, 454; motion of, 230, 
556; perfect, 229, 451, 492-493, 704-705; as
visualizations of vector fields, 327 

flux picture, Faraday’s, 697 

fly in car, velocity of, 162-163 

foamlike structure, of universe, 754, 758n 

foliation: Kaluza-Klein theory as, 689-690; spherically
symmetric mass distribution, 305-306 

force: central, 28, 36; external, influencing motion in 
curved spacetime, 301-302; fictitious, 278-279; as 
function of space, 26; as function of time, 26-27; 
per unit area, stress as, 228 

forms, closed, 604 

Fourier analysis, and inverse square law, 697-698 
Fourier space, gravitational field in, 758n

Fourier transformation, of scattering amplitude, 736, 
740 

fractional quantum Hall effect, fluids, 789n 

frame dragging, 460-461, 465-466; deformed 
by rotating body, 460f; etymology, 476n; with 
Lense-Thirring precession, 550 

frame field, 606n 

frames. See reference frames 

free fall, into rotating black holes, and first and 
second law of thermodynamics, 472. See also fall 

free Maxwell’s equations, 251 

free particles, 302; action of, 143, 162; motion of, 
180; noninteracting, 221 

“free” variables, in variational calculus, 116 

freely falling observers, metric for, 561 

Frenet-Serret equations, 97 

frequency, seen by different observers, 185 

frequency dependence, of scattering of gravitational
wave (or graviton), 717

frequency shift: relativistic, 186; in relativistic 
Doppler effect, 222 

Freundlich, Erwin, solar eclipse expedition to 
Crimea, 370 

Friedmann, Aleksandr, 501 

Friedmann-Robertson-Walker universes, 296, 491; in
outgoing brane wave model, 704 

Frost, Robert, and mass density transformation, 579 

functional derivative, definition of, 116-117 

functional variation, 114-115; alternative approach,
121-122

functionals: energy, of a membrane, 118; general, of
multiple functions, 123; notation of, 114


functions: and distributions, 33; variational calculus, 
115. See also specific functions 

fundamental constants, three needed, 12

fundamental equations, on glass windows, 138

fundamental fermions, 683

fundamental interactions: action principle 
description of, 141; unification of (see grand 
unified theory) 

fundamental principles, 12 


fundamental representation, of rotation group, 54

funnel analogy, misleading for black holes, 432

fusion, nuclear, compared to accretion disk radiation,
415

future light cone, 177-178; particle movement in, 
178f 


galaxies: formation in early universe, 519-520; 
forming of, and anthropic principle, 757; as 
masspoints on geodesics, 554. See also universe 

“galaxy far far away,” 241-246 

Galilean invariance, and fluid dynamics, 164 

Galilean limit, of past light cone, 179f 

Galilean transformation, 18-20, 19f, 159-160;
accelerated frames, 276-277; modifications of,
independence from observer, 168; necessary
modification of, 166; observed velocities, 161

Galileo: brachistochrone problem, 120; and free fall, 
268; law of acceleration, 140; versus Maxwell, 159; 
relativity principle, 17-19, 159; vision on flying of 
butterflies, 19f 

Galison, Peter, and special relativity theory, 18n 

Gamow, George, 177; and Einstein’s great blunder, 
393n; stars made of nothing, 456 

Gamow principle, 515-529; understanding of 
cosmology, 778 

gases: for cosmology, 230; nonrelativistic, 231, 454; 
relativistic, derivation of speed of sound, 235 

gauge: harmonic, 564; transverse-traceless, 565 

gauge condition, harmonic, 573 

gauge connection, 602 

gauge fields, emerging from lattice Hamiltonians, 
787 

gauge freedom, and initial value: in Einstein gravity, 
402; in Maxwell electromagnetism, 401-402 

gauge/gravity duality, 649 

gauge invariance, 248-250; in Kaluza-Klein theory,
672 

gauge invariant derivative, in electromagnetism, 342, 
353n 

gauge potential: of 2-dimensional solid state 
structures, 721; as dynamical variable, and energy 
momentum tensor, 381; and spinor fields, 789n; 
Yang-Mills, 682, 688 

gauge symmetry, local, in higher dimensional 
theories, 682 

gauge theories: and anthropic principle, 757; 
nonabelian, decoupling of geometries, 692; 
(non)abelian, 681; topological terms in, 720-721 

gauge transformations: as 5-dimensional coordinate 
transformation, 673; similarity to coordinate 
transformations, 564; strong gravitational sources, 
575 

Gauss, Carl Friedrich: determination of curvature of 
space, 65, 104-105; Theorema Egregium, 90-91 

Gauss-Bonnet theorem, 727 

Gauss’s equation, definition of, 99 

Gauss’s law: and evolving time, 402; and gauge 
theory, 401 

Gaussian normal coordinates, 298 

gears, function of, 109n 

gedanken experiments: “accelerated”/“dropped,”
280-283; by Einstein, 166; by Galileo Galilei, 
269 

Gell-Mann, Murray: on quantization of gravity, 583n; 
what is not taboo is a commandment, 361n 

general coordinate invariance, 305-306; 
determination of action for gravity, 344, 
346; in Kaluza-Klein theory, 672

general coordinate transformations, 312, 318; 
invariance of physics under, 403 

general covariance, 285 


general curved spacetime, spatial distance in, 
290-292 

general invariance: of Einstein-Hilbert action, 
compared to Maxwell action, 394; of matter action, 
and conservation of energy momentum, 383-384 

general relativity: abstract of, 20; as effective field 
theory, 773n; and Hamilton’s principle, Lorentz’s 
paper on, 397; modifications with respect to 
horizons, 784; solar system tests, 309; tensors in, 
312-319. See also gravity 

generalized uncertainty principle, 769 

generators: breakup into subgroups, 663; of 
conformal algebra, 615, 617; of Lie algebra, 49, 51; 
of rotation, 192; of rotation group, 40; of SL(4, R) 
group, 737, 739; of SO(3) group, 44 

genus, in Euler characteristics, 726 

GEO600, gravitational wave detector, 577n 

geodesic deviation, 552-561, 554; and Lie derivative, 
555 

geodesic equation, 128; alternative derivation of, 130; 
Christoffel symbols of, 129; comoving coordinates, 
298; curved spacetime, 277-278; invariance on 
rescaling, 559; motion in curved spacetime, 
289; and parallel transport, 545; presence of 
external forces, 301; rotating black holes, 459; 
transformation of Christoffel symbol, 330-331 

geodesic problem: free parameters, 124f; variational 
calculus, 123 

geodesics: at black holes, 426-427; collections of, 
554; congruence of, 554; covariant derivative, 
553; determination on Poincaré half plane, 133; 
distance of two nearby, 552, 553f; in embedding 
space, 645, 665; family of, 134; geometric 
construction of, 133n; intersection of, 134; 
lightlike, 292, 646; on Poincaré half plane, 134f; 
separation of, 552; on spheres, 127; timelike, 645 

geodetic precession, 549 

geometrical entities, in topology, 725-727 

geometrical view: of Kaluza-Klein theory, 691-693; of
special relativity, 582 

geometrodynamics, in higher dimensional theories, 
693 

geometry: analytic, role of coordinates, 48; 
conversion factor between physics and, 211; 
covariant derivative from, 323; dynamics of, 
in higher dimensional theories, 693, 693f; and 
invariance, 42-43, 48; of Minkowski spacetime, 
174-193, 238; non-existence of royal road to, 
Euclid’s remarks, 42; of points, isometry, 585; of 
relativistic point particle action, 210; of rotation 
groups, 191; and significance of coordinates, 
68; of a world, 6. See also differential geometry; 
Riemannian geometry 

Ghostwriter, The (Roth), 254 

Gibbons, Gary, discovery of Hawking radiation, 449n 

Gibbons-Hawking radiation, 638; mystery of, 637 

Gibbons-Hawking-York boundary term, 399n 
global character of space, versus local, 76-77

global positioning system (GPS), 287, 291 

globe, curves of constant latitude on, 92 

gluon scattering: amplitudes for, 785; Feynman 
diagrams for, 735-736; in terms of ambitwistors,
738; in terms of helicity spinors, 735-736 

gluons: in brane models, 696; in early universe, 526 

GMT (Greenwich Mean Time), 133n 

God: existence of, 520; “What is greater than God?” 
puzzle, 789n 

Goldberger, Murph, on his aunt, 321n 

golden age of cosmology, 491 

“golden” guiding principle, in theoretical physics, 
338 

Gordon, Walter, Klein-Gordon equation, 694 

Grace, Louis, constructor of old man’s toy and of war 
chariot, 267 

graceful exit problem, 534 

gradient: definition of, 61; notation of, 54; 
transformation of, 320 

grand unification, mystery of, 527n 

grand unified theory, 527; and charge, 786; in early 
universe, 518; energy scale, 767; and higher 
dimensional theories, 681; and Kaluza-Klein
theory, 672; in M versus R plot, 14f; magnetic 
(anti)monopoles, 532 

Grassmann variables, 606n; and supertwistors, 739n 

gravitating system, far field of, 576 

gravitation, field equation for, Einstein’s search for, 
341-342 

gravitation law, Einstein’s belief of inconsistency 
with principle of causation, 404 

gravitational collapse, of spherically symmetric dust 
cloud, 373 

gravitational constant, 11; time dependent in brane 
models, 707 

gravitational coupling, 768 

gravitational energy, 580-581; binding, 455-456; of 
hanging string, 114 

gravitational field: classical, quantum particles 
in, 771; completion and promotion of, 218; 
conservation of energy momentum in, 386; 
determination of, 338; dynamics of, 146; and 
equivalence principle, 271; finite sized objects in, 
716-715; in Fourier space, 758n; in great distance 
of black hole, 574; momentum of, 580-581; nature 
of, 218-237, 231; quantization of, 582; strong 
stationary source, 574; as tensor field, 231 

gravitational field limit, Newtonian gravity as, 391 

gravitational interaction, 581; compared to 
electromagnetic interaction, 768 

gravitational Lagrangian, 339 

gravitational lensing, 370-371 

gravitational mass, equality to inertial mass, 28, 257, 
268-269 

gravitational potential: action principle, 145; around
black holes, 410-411, 411; connection to mass
distribution of, 578; Newton’s, 119; satisfying
Poisson’s equation, 708

“gravitational radius,” of massive objects, 764 

gravitational redshift, 259; “accelerated”/“dropped”
gedanken experiments, 282-283; at black hole 
horizon, 412; measurement of, 284, 287; motion 
in curved spacetime, 303-304; and time dilation, 
284 

gravitational sources: approximations for, 570; 
strong: gauge transformations, 574, 575; weak 
field approximation, 569-570 

gravitational waves, 563-577; astronomy with, 
563; from binary systems, 714; degrees of 
polarizations, 564; detection of, 566, 577n; 
deviating Minkowskian spacetime, 571-572; 
emission of, 567; frequency dependence 
of scattering of, 717; localized packets of, 
577n; propagation of, 566, 568; removal by 
coordinate transformation, 577; similarities to 
electromagnetism, 568; speed of, 579; time and 
gravity, 579. See also gravitons 

gravitomagnetic field, 571 

graviton coupling: Einstein-Hilbert action of, 582; to 
electron line, 756 

graviton mass, 785 

graviton scattering, 782; frequency dependence, 717; 
off each other, 731; scattering amplitude of, 739, 
761, 770; unitarization by formation of black hole, 
765 

graviton spin, quadrupole radiation, 571 

gravitons, 566; Feynman propagator for, 573; 
fluctuating, 712; from gravitational waves, 780; 
Hawking radiation of, 439, 450; interaction 
among, 738-739; in large extra dimensions, 
696; from lattice system, 787; massless, 718; 
as non-bound states, 785; versus photons, 768; 
propagator, momentum of, 786; role of energy 
momentum for, 383; self interaction, 582; and 
spatial direction, 785; of spin 2, 697. See also 
gravitational waves 

gravity: action for (see Einstein-Hilbert action); 
as classical probe, 771; classicalization of, 766; 
completely altering causal structure of spacetime, 
780; connection with time, 579; container for, 
649; cubic vertex for, 744; Dysonian view on 
quantization of, 780, 788n; effective action for, 
711; effective field theory of, 766; without Einstein- 
Hilbert action, 771; in empty spacetime, 362; 
as fictitious force, 279; first order formalism 
for, 395; high energy behavior of, 767-768, 782; 
indifferent to the universe, 778; induced, 770; 
inherent instability of, 520; introducing an energy 
scale, 770; introducing natural quantities, 764; 
linearized, 563-577, 758n; mystery of, 778-779; 
“naked” singularities, 480; nonlinearity of, 571; 
non-quantization of, 768-769; omniscience of,
and cosmological constant paradox, 745; as part
of larger structure, 786; quantization of, 780; 
quantum, 439, 443-444; and spacetime, origins 
of, 787; and spacetime curvature, mystery of, 276; 
speculative thoughts about, 788; surface, 473; 
symmetry imposed on, 254; theory of, as analog 
to theory of light, 789n; time and, 257-258; time 
dilation caused by, 258-259, 284, 304, 412; true 
scale of, 698-700, 702; unification, 674-676, 767-768,
780; unimodular, 755-756; universality of,
258, 269-270, 275-276. See also general relativity; 
Newtonian gravity 

gravity attraction, and strong energy condition, 
562n 

gravity express, 33 

gravity potential, particles moving in, tensor notation 
of, 57-59 

Gravity Probe B, launch of, 551n 

great circles, 127; on earth’s surface, 275; 
movement of, particle on sphere, 148; on sphere, 
determination of curvature, 105 

Greek symbols, in tensor notation, 63 

Green’s function: different determinations of, 573; 
for gravitational waves, 567-568; for quantum 
fluctuations, 447 

Greenwich Mean Time (GMT), 133n 

Grimm stories, and quantum gravity, 773n 

Grossmann, Marcel, and Einstein: paper on 
variational principle for gravity, 396; search for 
field equation for gravitation, 353 

ground states: degeneracy of, 723; in string theory, 
757 

group theory: and commutation, 49; and counting, 
56-57; of exponentially expanding universe, 
642-643; metric for expanding universe, 645; of 
universe, coset manifold, 644 

groups: 2-by-2 matrix as generator of, 663; 
decomposition of, 56-57, 56f; Eöt-Wash, 260;
Euclidean, as symmetry group of physics, 755; 
isometry (see isometry group); Lie, characteristic 
of, 50; Lorentz (see Lorentz group); Poincaré, 
transformations and translations of, 666; 
renormalization, and scaling, 754; representation 
of, 225; requirements of, 193; rotation (see rotation 
groups); subgroups, 57, 663. See also specific groups 

Gullstrand, Allvar, and Painlevé-Gullstrand 
coordinates, 417 

Gupta, Suraj, and curved spacetime, 580 

gyroscopes: gravitational precession of, 465; 
launched with satellite, 549; precession of, 
549-551 


h, explanation of symbol, 773n 

Hale, George, proposition of solar eclipse observation 
to test Einstein’s theory, 367n 

half plane, Poincaré: with differential forms, 608; 
and metric, 67-68 


Hamilton’s principle, and general theory of relativity, 
Lorentz’s paper on, 397 

Hamiltonian: derived from Lagrangian, 144; leading 
to gauge fields, 787; of zero value, in quantum 
systems, 723 

handle, in Euler characteristics, 726 

hanging membrane, 118f; as generalization of 
hanging string, 118 

hanging string: transition to falling apple, 137f; and 
variational calculus, 113-123 

harmonic gauge condition: in quantum field theory, 
573; strong gravitational sources, 575; weak field, 
564 

harmonic oscillator: actual path of, 148e; annihilation 
and creation operators, 447; energy of, 758n; 
in field theory (classical and quantum), 361; 
in quantum mechanics, 746; symmetry and 
invariance, 242 

Harrison, E. R., on masks of the universe, 779 

Hawking, Stephen, 14 

Hawking radiation, 436-450; derived from quantum 
field theory, 780; fundamental paper on, 14-15; 
of gravitons, 450; history of discovery, 449; as one 
key for understanding quantum gravity, 748. See 
also Gibbons entries 

Hawking temperature: determination of, 444-445; 
dimensional analysis, 14-15; and entropy, 441; of 
Schwarzschild black hole, 436 

heat, understanding of, 786 

Heaviside, Oliver, and Maxwell’s equations, 405n 

Heisenberg picture, of quantum physics, 
consequences for gravity, 771 

Heisenberg’s uncertainty principle. See uncertainty 
principle 

helicity, of gravitational waves, 734 

helicity spinors, 731; Lorentz invariance, 734; power 
of, 735; scattering amplitude expressed in terms 
of, 734-735 

helicity states, of graviton in QFT, 566 

helium: liquidity at zero temperature, 748; primeval 
nucleosynthesis of, 518 

hell, and hollow earth theory, 32 

Heron of Alexandria, 149n 

hierarchy problem, 699 

Higgs mass term, 712 

Higgs mechanism, 679 

high energy behavior, of gravity, 767-768, 782 

high-energy physicists, particle physicists renaming 
themselves, 713n 

high energy physics: linkage to low energy physics, 
752; naturalness doctrine in, 749-750 

high temperature superconductivity, 789n 

higher dimensional Einstein-Hilbert action, 681, 782 

higher dimensional metric, 682 

higher dimensional spaces: definition of, 43-44; 
embedding curved spaces, 85-86; rotation as 
freedom left, 88; rotations in, 44-45, 49-51 
higher dimensional spheres, metric of, 80e

higher dimensional theories: dynamics of geometry,
693f; Kaluza-Klein/Yang-Mills, 680-682; string
theory, 695

higher energies, larger structure of energy, 786 

Hilbert, David, and Einstein-Hilbert action. See
Einstein-Hilbert action

Hilbert-Einstein priority dispute, on field equations,
396

historical digressions, Newton’s constant, 31-32

Hodge star operation, 602; on differential forms, 
723-725 

hole argument, Einstein’s, 404 

“holes,” number of, 726 

hollow earth theory, 32 

holographic principle, 441; black hole entropy, 15; 
mapping of spacetime, 649 

homogeneity problem, 531 

homogeneous space, 289, 292, 491; definition of 
with Killing vectors, 588; in outgoing brane wave 
model, 704 

horizon: crossing in static coordinates, 635; de 
Sitter, 293, 636f; detection by indirect local 
measurements, 789n; event vs. particle, 536; 
inner, Kerr black holes, 469; outer, Kerr black 
holes, 468-469; at Schwarzschild radius, 412, 419, 
431-432; sound, 524; as source of confusion, in 
Schwarzschild solution, 376n; as switch of Killing 
vector, 631 

horizon problem, 530-533 

“How do you do?” 333

Hoyle, Fred, and Big Bang, 498 

Hubble constant, 504, 632; and communication 
in expanding universe, 293; determination of, 
391; discovered by Lemaître, 501; in inflationary
cosmology, 535 

Hubble length, and critical density, 514 

Hubble parameter. See Hubble constant 

Hubble radius, of universe, 711; and photon mean 
free path, 517 

Hulse, Russell A., detection of binary pulsar, 563 

humans: distance between head and toe in spacetime, 
658f; existence of, and anthropic principle, 757 

hydrogen atom, and SO(4) group, 49n 

hydrostatic equilibrium, relativistic stellar interiors, 
453-454 

hyperbolic coordinates: angle, 628; anti de Sitter 
spacetime in, 661 

hyperbolic radial coordinate, 653-654 

hyperbolic shell, momentum restricted to, 220 

hyperbolic spaces, 92-93, 296, 627; as coset 
manifolds, 590; cosmological principle, 491; line 
element of, 628 

hyperboloid, of rotation, de Sitter spacetime, 625 

hypersurface, spacelike 3-dimensional, 693f 


ideal gas, equation of state of, 231 
identities, versus equations, 403

identity matrix, definition of, 39, 63 

illusion, of time, 177 

images, method of, 620; conformal algebra for, 620 

imaginary time, in derivation of Hawking 
temperature, 445-446 

impact parameter, light deflection: around black hole, 
416; and gravitational lensing, 371; in spacetime, 
309, 309f 

incompressible fluids, 454 

index-free representation of vector fields, 319 

index notation: under coordinate transformations, 
71-73; fear of, 32, 53; of quantities (in general), 
32; and rotations, 44-45; SO(D) group, 49; use of, 
43-44 

index summation. See summation convention 

indexed objects, handling by human mind, 607 

indices: changes in general relativity, 547; contraction 
of, 46n, 345; conversion with vielbein, 596, 603; 
different, 595; explosion of, Einstein’s gravity, 
131; of four-vectors, 182; magic of, 140; naming 
conventions, 608; object without indices not 
transforming as scalar, 719-720; order of, 
memorization, 132; repeated, contraction of, 
58n; in Riemann curvature tensor, 351; sea of, 
and differential forms, 599; summation over (see 
summation convention); upper and lower, 64, 
314-316; and vectors, scalars and tensors, 73-74. 
See also index notation 

induced gravity, 770 

inertia: “effect” of, 276; law of, action principle, 143; 
Sylvester’s law of, 193e 

inertial force, 278 

inertial frames, and locally flat coordinates, 278 

inertial mass: equality to gravitational mass, 28, 
34, 257, 268-269; Galilean transformation of 
accelerated frames, 276 

infinitesimal area, enclosed by closed curves, 547 

infinitesimal boosts, of Lorentz transformation, 187 

infinitesimal differences, 160 

infinitesimal rotations, 40 

infinitesimal segments, space and time experience 
of, 180 

infinitesimal transformations: as generators of 
conformal algebra for Minkowski spacetime, 615; 
in Lorentz algebra, 187 

infinitesimal volume element, and metric tensor 
formalism, 75-76 

infinity, and human mind, 779 

inflation of universe, 534-535; and cosmological 
constant paradox, 751; and scalar fields, 788n 

inflationary cosmology, 530-536; cosmological 
constant in, 534; Hubble parameter in, 535 

inflaton field, 534 

inflaton potential, 535f 

information paradox, of black holes, 439 


infrared regime: extreme ultra, and cosmological
constant paradox, 750-751; linkage to ultraviolet
regime, 752

inherent instability in dynamics with higher powers 
of time derivative, discovery of, 338 

initial value formulation, in numerical relativity, 693 

initial value problems: in electrodynamics, 404; and 
numerical relativity, 400-405 

initial values: on Cauchy surface, 402; evolving 
in time, basic scheme of, 400-401; and gauge 
freedom, 401-402 

initially static branes, 707 

inner horizon, of Kerr black holes, 469 

innermost stable circular orbit (ISCO), 414, 474 

instability, inherent, of gravity, 520 

integrability condition, and determination of 
potential, 36 

integrands, analytically continued into complex 
plane, 732 

integration: by parts, 115, 326; variational calculus, 
116; over volume, at specific time, 226 

integration measure, covariant differentiation, 326 

interaction: with classical fields, 221n; contained
in matter action, 384; contribution to energy
momentum tensor, 383; as part of matter action,
383

interaction potential, particle movement in, 162 

interferometry, detection of gravitational waves, 567 

internal coordinates, 675; for points in spacetime, 
689f 

internal space: emergence of Yang-Mills theory, 688; 
spacetime perpendicular to, 689 

intersection, of geodesics, 134 

intrinsic curvature, 5-6; counting for characterizing
of, 110; as defined by Gauss, 107; determination 
without knowledge of embedding of surface, 90; 
versus extrinsic, 107-108; and matrix eigenvalues, 
84-85; metric as prerequisite to calculate, 90-91; 
of spacetime, compared to extrinsic, 85 

intrinsic lifetime, of particles, 198 

Introduction to the Theory of Relativity (Bergmann), 
376n 

invariance: coordinate, general, 305-306, 672, 
682; CP, violation of, 528, 683; difference from 
covariance, 47; Galilean, of Newtonian mechanics, 
161; gauge, 248-250, 672; and geometry, 42-43,
48; local coordinate, in higher dimensional
theories, 682; Lorentz, 242, 253; Noether’s 
theorem, 310; of physical laws, 46-48; of physics 
under general coordinate transformation, 403; 
Poincaré invariant brane, 707; rotational, 118, 697; 
scale and conformal, 621; of separation, 623e; of 
string action, 216e; and symmetry, 242-243; time 
reversal, 416-417, 500; under transformations, 
of Poincaré coordinates, 657; translation, 242, 
303-304 

invariance group of physics, rotation group as, 755 

invariant curvature, 339 


invariant scalar products, in parallel transport, 544 

invariant tensors, definition of, 59-60 

invariants, topological, 725-727 

inverse Compton scattering, 235e 

inverse length, dimension of, 120 

inverse light speed, analogy to cosmological constant 
paradox, 754 

inverse metric, 315 

inverse square law, 120; spatial dimensions, 122 

inverse temperature, 445 

inversion, of spacetime, 743-744 

invisible dimensions, 672-673 

irreducible representations, 54-57 

ISCO (innermost stable circular orbit), 414, 474 

isometric conditions, for metric, 586 

isometric spacetime, around rotating black holes, 
459 

isometry, 585-593; conformal transformations, 614; 
hidden underlying, 631; intuitive account of, 589; 
light cone coordinates, 631 

isometry group: of AdS³, 663; for anti de Sitter
spacetime, 650; of de Sitter spacetime, 625; 
equality with conformal groups, 656; and higher 
dimensional theories, 682; identical, for de Sitter 
spacetimes, 664 

isomorphism: between AdS³ and SL(2, R), 663; of
Lie algebra, and conformal algebra, 618 

isoperimetrical problem, 149e; Lagrange, solution of, 
144 

isotropic fluids, seen by comoving observer, 229 

isotropic space, 289, 292, 491; definition of with 
Killing vectors, 588; in outgoing brane wave model, 
704; spherically symmetric mass distribution, 305 

isotropic spacetime, static, motion in, 306-307 

isotropy problem, 531 


Jacobi identity, Bianchi identity derived as special 
case of, 393 

Jacobian, 216; changes of, 235; and coordinate 
transformations, 75; differential forms of, 598 

Jacobian determinant, for Lorentz transformations, 
188 

Jeans, James, structure formation in early universe, 
520 

Jebsen-Birkhoff theorem: with gravitational waves, 
568; Newton-Jebsen-Birkhoff theorem, 453; and 
time dependent spherically symmetric mass 
distribution, 373-374 

Jordan, Pascual: anticommutation manuscript, 789n; 
stars made of nothing, 456 

Jordan frame, 686 


Kaluza, Theodor: letter to Einstein, 671; letters from
Einstein, 693-694

Kaluza-Klein action, 686; in Jordan frame, 686

Kaluza-Klein metric, 676, 680; in vielbein formalism,
690-691

Kaluza-Klein theory, 671-695; charge conjugation
and antimatter in, 678; charge quantization in, 
677; coordinate invariance in, 672; Einstein- 
Hilbert action in, 675; electromagnetic field in, 
691; escape problem in, 673-674; as foliation, 
689-690; gauge invariance in, 672; geometrical 
view of, 691-693; and grand unified theory, 672; 
higher dimensional, 680-682; linking of internal 
and external geometries, 691; Lorentz action in, 
678; Maxwell action in, 675-676; motion of point 
particles, 676; phase angle of wave function in, 
678; Planck length and charge quantization in, 
677; Planck mass in, 675; transformations in, 672; 
and uncertainty principle, 674; visibility problem 
in, 673-674; and Weyl, 693-694. See also quantum 
gravity; string theory 

Kaluza-Klein towers, 679

Kasner universe: as solution of Einstein’s field 
equation, 361e; with differential forms, 613e 

Kepler’s third law: orbits around black holes, 
413-414, 417; precession of gyroscopes, 549 

Kerr, Roy, and rotating black hole solution of 
Einstein’s field equation, 458, 461 

Kerr black holes, 462, 464-467; angular momentum, 
465, 571; angular velocity for, 462f; mass 
determination, 570; no-hair theorems, 481-482;
and Schwarzschild black holes, 468; Weyl
approach, 473. See also rotating black holes 

Kerr metric, 465-466, 475 

Kerr-Newman solution, 477 

Kerr-Schild form, 476 

Kerr spacetime: Killing vectors, 470-471; radiation 
from rotating black holes, 473 

Killing, Wilhelm, and Lie algebra, 586 

Killing condition, conformal, 614 

Killing vector fields, 332, 585-593; conformal, 614, 
623e; definition of, 586 

Killing vectors: admitted by spacetime, 636; 
derivation of curvature tensor, 591; emergence 
of Yang-Mills theory, 688; great circles, 127n; 
and higher dimensional theories, 682; for Kerr 
spacetime, 470-471; and Lie algebra, 591; linear 
combinations of, 587; in Riemannian manifold, 
588; for spacetime around rotating black holes, 
459; for spherically symmetric mass distribution, 
305; for static isotropic spacetime, 310; timelike 
and spacelike, 637; from timelike to spacelike, 
631 

kinematics, relativistic, 221 

Klein, Oskar: Klein-Gordon equation, 694. See also 
Kaluza-Klein entries

Kraichnan, Robert: curved spacetime, 580; “particle 
physics” approach, 583n 

Kretschmann scalar, 365n 

Kronecker delta: definition of, 36, 70; discrete 
variables in functional variations, 121; indices of, 
183; as invariant tensor, 60; use of, 45 
Krupp (munitions manufacturer), financing of solar
eclipse expedition to the Crimea, 370 

Kruskal, Joseph B., paper on spherical singularity, 
376n 

Kruskal coordinates, for elimination of singularity of 
Schwarzschild solution, 365 

Kruskal-Szekeres coordinates, 424-425; wormholes 
in, 432-433 

Kruskal-Szekeres diagram, 425-427; of 
Schwarzschild black hole, 426f; Unruh 
effect, 447 

Kruskal-Szekeres-like coordinates, for de Sitter 
spacetime, 635, 636f 

kung fu stories, 470n 


Lagrange, Joseph-Louis: tautochrone problem, 144; 
variational calculus, 120 

Lagrange multiplier, 148; introduction of, 106; notion 
of, 109; for volume of spacetime, cosmological 
constant as, 756 

Lagrangian: for 2-brane model, 700; in action 
principle, 138; in curved spacetime, determination 
of, 712; gravitational, 339; infinitesimal 
transformation, 151; Maxwellian, 249, 255; of 
motion in static isotropic spacetime, 306; in 
nonrelativistic mechanics, 138-139; of relativistic 
point particle action, 211; Schwarzschild 
spacetime, time reversal invariance, 417; terms 
added for determination of gravitational with 
respect to electromagnetic field, 338; without time 
dependence, energy conservation, 153 

Lanczos, Kornel, corrections to de Sitter metric, 289, 
642 

Landau, L. D., Green’s function approach, 577n 

Laplace, Pierre-Simon: black hole hypothesis, 13; 
Michell-Laplace argument, 409 

Laplace’s equation: for strong gravitational sources, 
574; and tensor notation, 58 

Laplace-Runge-Lenz vector, definition of, 60 

Laplacian: definition of, 61; in membrane shape 
determination, 118; notation in various coordinate 
systems, 78-79 

“lapse,” 691, 693 

large extra dimensions, 696-707 

Large Hadron Collider, 699 

Larmor, J., Lorentz transformation, 169n 

Laser Interferometer Gravitational Wave Observatory 
(LIGO), 577n 

Laser Interferometer Space Antenna (LISA), 577n 

laser interferometry, detection of gravitational waves, 
567 

laser light, box hit by, 281f, 283f 

Latin symbols, change to Greek symbols, in tensor 
notation, 63 

lattice gravity, 726n; as approach to quantum gravity, 
760 

lattice Hamiltonians, leading to gauge fields, 787 


laws. See specific laws 

Le Verrier, Urbain, prediction of Neptune, 368 

Leaning Tower of Pisa, 270 

least path principle: and curvature, 5-6. See also path 

least time principle, 4, 136; connection with action 
principle, 139, 144; Feynman’s path, 3-4. See also 
time 

Legendre polynomials, 523 

legs. See reference frames 

Leibniz, Gottfried: brachistochrone problem, 120; 
discovery of calculus, 113; notation of action 
principle, 138 

Lemaître, Georges: closed and open universes,
296-297; Hubble constant, 501; as triple winner, 
500 

Lemaitre-de Sitter cosmology, 712 

Lemaitre-de Sitter metric, 357; generalized, 489 

Lemaitre-de Sitter spacetime, 642 

length: inverse, dimension of, 120; minimization 
of, 125; parametrization in general metric, 128; 
parametrization independence of, 130; of rulers, 
in special relativity, 199; units for, 10, 633 

length contraction, 199-200 

length element, on unit circle, 80e 

length scales: cosmological, physics on, 750; and 
cosmological constant, 748; and deviation from 
Newtonian gravity, 709; leading to cosmological 
constant paradox, 711 

Lense-Thirring precession, 550; alternative 
derivation, 551 

leptogenesis, 526-528 

leptons, families of, 786 

Levi-Civita symbol, 252; used to contract indices, 719 

l’Hospital, Marquis de, brachistochrone problem, 
120 

Lie, Marius Sophus: infinitesimal rotations, 40; 
infinitesimal transformations, 154; method for 
derivation of Lorentz transformation, 187-188 

Lie algebra: definition of, 50-51; discovered by 
Killing, 586; generators of, 49; isomorphism of, 
618; and Killing vectors, 591; of rotation groups, 
191 

Lie derivative, 327-328, 331-332; and geodesic 
deviation, 555 

Lie’s equation, and emergence of Yang-Mills theory, 
688 

Lie groups, characteristic of, 50 

Lifschitz, E., structure formation in early universe, 
520 

light: “accelerated”/“dropped” gedanken 
experiments, 281-282; least time principle, 
4; Maxwell’s explanation, 162; motion around 
black holes, 409-418; motion of, 307-309, 416f, 
659; propagation of, in medium, 163; theory of, 
as analog to theory of gravity, 789n; unification 
with material particles, 207-217, 212-213. See also 
deflection of light; photons 


light cones: closing up, 420f; coordinates of, 146-147, 
170-171, 427-429, 619, 631, 704; in expanding 
universe, 294, 294f; past, 177-178, 179f; spanned 
in Minkowski space, 177; tilting, at Schwarzschild 
radius, 420-421, 421f 

light deflection. See deflection of light 

light flashes, in trains, 166 

light paths: anti de Sitter spacetime, 656; depending 
on geodesics, 665 

light pulses, dueling thinkers experiment, 7 

light rays: corotating/counterrotating, 461, 469; more 
fundamental than spacetime events, 741; moving 
at 45°, 423; surfaces generated of, 185 

light signal trajectories, in static spacetime, 304f 

light speed, 162; constancy of, effect on notion of 
simultaneity, 8; determined by Maxwell’s theory, 
162-163; in expanding universe, 294; inverse, 754; 
in metals, ratio to sound speed, 749; as velocity of 
massless particles, 213 

lightfoot, not a unit of time, 773n 

lightlike 4-momenta, 782 

lightlike distance, 175 

lightlike geodesics, 292, 646 

lightlike lines, in general spacetime, 730 

lightlike momentum, complex, 733 

lightlike vectors, 731 

lightsecond, natural unit of distance, 168 

lightyear, as length unit, 10 

LIGO (Laser Interferometer Gravitational Wave 
Observatory), 577n 

limit surface, stationary, angular velocity inside, 
471 

line: of constant time, de Sitter spacetime, 637; in 
twistor space, 742. See also straight line; worldlines 

line element: 5-dimensional, 676; of hyperbolic 
space, 628; square of, and metric, 64 

linear combinations, and tensors, 53 

linear transformations, rotations as, 68 

linearity of transformation matrix, 313 

linearity requirement, Galilean transformation, 18 

linearized gravity, 563-577, 758n 

LISA (Laser Interferometer Space Antenna), 
577n 

lithium, primeval nucleosynthesis, 519 

local action, electromagnetic, 246 

local coordinate invariance, in higher dimensional 
theories, 682 

local curvature, measurement of, 547 

local field theory: and cosmological constant paradox, 
756; invariance of physics, 621 

local flatness: of curved surface, 83; for spaces of any 
dimension, 86-87 

local gauge symmetry, in higher dimensional 
theories, 682 

local Lagrangian, in action, 783 

local measurements, indirect, detecting horizons, 
789n 

local observables, 765; absence of, in quantum 
gravity, 772 

locality: as fundamental principle of theoretical 
physics, 783; of physics, 757 

locally exact forms, 604 

locally flat coordinates, 557; determination of, 132; 
and inertial frames, 278; for investigations of 
symmetry relations, 343-344; Minkowskian, 288; 
nearby geodesics, 552; transformation of polar 
coordinates into, 89 

locally flat Euclidean metric, second order corrections 
to, 88 

locations: of events in spacetime, 195; and spatial 
coordinates of particles, difference between, 31 

long distance behavior, of action terms, 722 

long distance expansion, deviation from Newtonian 
gravity, 708-709 

“long distance physicists,” 713n 

loop quantum gravity, 772 

Lorentz, Hendrik: and Droste’s solution of Einstein’s 
field equation, 375; paper on Hamilton’s principle 
and general theory of relativity, 397; paper on 
variation of Lagrangian, 396; understanding of 
waves, 783-784 

Lorentz action, in Kaluza-Klein theory, 678 

Lorentz algebra, 187; extension to Poincaré algebra, 
192, 617 

Lorentz boost: of 4-vector, 230; of mass density, 579; 
SL(2, C) group, 730 

Lorentz contraction: of box with particles, 23; and 
number density, 223f 

Lorentz-Fitzgerald length contraction, 199-200; pole 
in the barn problem, 202 

Lorentz force law, 245, 247; movement of charges, 
404 

Lorentz group, 188, 218; connection to rotation 
group, 192; covered, 729-730; SO(3, 1), 730 

Lorentz indices, 594, 608; conversion with vielbein, 
603; versus world indices, 595 

Lorentz invariance: beyond cosmological length 
scale, 754; helicity spinors, 734; Maxwell’s 
equations, 253; Newtonian action, 242; of physics, 
218; of spacetime, 666 

Lorentz scalar: definition of, 218; and density 
distribution, 579 

Lorentz symmetry, restrictions imposed on 
electromagnetism, 339 

Lorentz tensors, 188, 243 

Lorentz transformation, 166-173; alternative route, 
172; within cars, 205; and curved spacetime, 317; 
definition in Minkowski spacetime, 181-182; 
invariance of, 186; low velocity limit of, 169; sneak 
preview of, 147; tensors under, 193e 

Lorentz vector, Pauli spinors as “square root” of, 731 

Lorentz vector potential, 243, 248 

Lorentzian Lagrangian, 249 

Lorenz gauge, in electromagnetism, 564 



low energy effects, of quantum gravity, 767 

low energy physics: linkage to high energy physics, 
752; understanding of, 750 

low energy world, neglect of quantum gravity, 766 

lower indices, 314-316; introduction of, 64; 
transformations in change of coordinates, 71-73 

luminosity distance, 297 


macrostates, Bekenstein-Hawking entropy, 441, 444 

magnetic field, 245; relativistic unification, 247; 
Schrödinger equation for (nonrelativistic) charged 
particle in, 354n 

magnetic flux lines, Faraday’s picture of, 728 

magnetic moment, of atom, and action, 715 

magnetic monopoles, 81; bosons bound to, 789; 
Newtonian approximation of Einstein’s field 
equation, 577; relic problem, 532; topological field 
theory, 728 

Maldacena, Juan, and quantum gravity, 649 

manifolds: Calabi-Yau, 695; coset (see coset 
manifolds); Riemannian, 599-600; rotations 
determined in, 590; topology of, and ground 
states, 723; without boundary, 727 

many particle case, and fields, 400 

many particle systems. See fluids 

many worlds interpretation, of quantum, 780 

map. See Mercator map 

mapping: of heaven and earth, subdivision of degree 
(proposal by Ptolemy), 368n; of twistor space to 
spacetime, 742 

marble, positional variations in bowl, 114 

marine recruit in boot camp, following rotation 
commands, 50f 

mass: changes of, 221; as conversion factor between 
geometry and physics, 211; definition of, in 
elementary physics, 213; of elementary particles, 
16n; gravitational and inertial, 257, 268-269; 
Planck (see Planck mass); role of, in action 
principle, 142; spherically symmetric distribution, 
304-307, 310-311, 409; of universe, 747-748 

mass density transformation, under Lorentz boost, 
579 

mass dimensions: and dimensions of scalar 
curvature, 711; role in quantum field theory, 
711-712 

mass distribution: and gravitational potential, 578; 
from point masses, 119; rotating, gravitational 
sources, 569; spherically symmetric, 373-374, 
569, 571 

mass loss, of radiating atoms, 232 


mass scale: of cosmological constant, 700; as limit of 
understanding of quantum field theory, 746 

mass shell condition, 220, 464 

massive objects: “gravitational radius” of, 764; 
motion of, 659-660, 659f; worldlines of, 175 

massless particles, 307-309; accelerated relativistic, 
277; gravitons, 718; motion around black holes, 
415-416; mystery of, 213; natural parametrization, 
308; preferred parameter choice for, 215; 
relativistic action principle for, 213; worldlines of, 
175 

material particles: Euler-Lagrange action of, 207; 
unification with light, 207-217, 212-213 

mathematical entities, as tensors, 52 

mathematical universes, 634 

mathematics: difference from arithmetic, in terms of 
rotations, 56; as poetry of logical ideas, 150 

matrices: antisymmetric, introduction of, 40; 
commutation of, 41; exponential of, 41; as group 
generator, 663; introduction of, 39-40; and 
operators, 48; rotation matrix, definition of, 38; of 
spacetime metrics, 183; transpose of, 45 

matrix algebra, quick review of, 742-743 

matrix differentiation, 322 

matrix elements, counting of, 87-90 

matrix theory, for relativistic action, 210 

matter: baryonic, 502-503, 506; dark (see dark 
matter); observational evidence, 503f; spherical 
shell of, 423f 

matter action: contribution of Maxwell action to, 378; 
fields contained in, 384; general invariance of, 
and conservation of energy momentum, 383-384; 
generic, 386; interaction as part of, 382-383; as 
part of action of world, 378; variation of, 378-379 

matter-antimatter asymmetry, 528; in higher 
dimensional theories, 683 

matter density, and scale factor of universe, 496f 

matter dominance: and coincidence problem, 499; 
and photon decoupling, 788n 

matter equation of motion, and matter action, 
386 

Matthew principle, 520; Birkhoff theorem as example 
for, 376n; Lorentz transformation as example for, 
169n; the rich inheriting from the wimps, 523 

maximal symmetry: anti de Sitter spacetime, 650; 
and coset manifold, 625 

maximally symmetric spaces, 585-593; negatively 
curved, 610 

Maxwell, James C., versus Galileo, 159 

Maxwell action, 325, 332; Chern-Simons term 
added to, 721; contribution to matter action, 
378; and differential forms, 724-725; general 
invariance, 384, 394; in Kaluza-Klein theory, 675-676; 
long-distance behavior, 722; in Minkowskian 
spacetime, 381; scale and conformal invariance, 
621; vanishing by variation, 384; from weak field 
action, 572. See also electromagnetic action 

Maxwell’s equations, 252-253; and Bianchi identity, 
724; for charged black holes, 477; coupled to 
Einstein’s field equations, static solutions, 482-483; 
in curved spacetime, 333; free, 251; and 
initial value problem, 404 

Maxwell field, in terms of Yang-Mills field, 789n 

Maxwell Lagrangian, 249, 255, 382 


Maxwell’s laws of electromagnetism, and Galilean 
transformation, 20 

Maxwellian electromagnetism: differences to 
Newtonian gravity, 338; gauge freedom and initial 
value, 401-402; speed of light, determined by, 163 


Mead, C. Alden, generalized uncertainty principle 
and quantum gravity, 769 

mean free path of photons, 517 

measuring device, collapsing into black hole, 
763-764 

mechanics: immediate formulation of, 142; least 
action formulation of, 139 

medium, for propagation of light, 163 

membranes: hanging, as generalization of hanging 
string, 118; from null surfaces, 185 

Mercator, Gerardus, importance of angles, 620 


Mercator map: and coordinate transformations, 79e, 
94; singularity at poles, 365; of the world, 620 

Mercury, perihelion shift, 368-369, 369f 

messages, paths through spacetime, 638 

metals, ratio of sound speed to light speed, 749 

metric: in case of isometry, 586; change under 
coordinate transformations, 70-71, 110; chosen 
in Riemannian manifold, 88; conformally 
flat, 80e-81e, 94, 352e-353e; constraints on, 
403; for contracting spacetime indices, 719; in 
cosmological action, 357; definition range in 
parallel transport, 544; determinant of, 215-216; 
and different indices, 595; differentiation of, 131; 
as dot product of vielbein, 603; for expanding 
universe, group theory, 645; in Fermi normal 
coordinates, 559, 561; flat space, 77; formed by 
coordinate scalars, 708-709; induced by ambient 
Euclidean metric, 86; integral over, 770; intrinsic 
curvature calculation, 90-91; invariance under 
scaling, Poincaré coordinates, 657; Lemaitre-de 
Sitter, 357; and line element, 64; for lowering 
or raising indices, 74; not related by coordinate, 
81e; restriction by isometric condition, 586; 
role in differential geometry, 66; Schwarzschild, 
discovery of, 364; second order deviation of, 343; 
of space, 128; in spacetime, 181, 716; on sphere, 
determination of, 65; in spherical coordinates, 108; 
for spinor indices, 742; on surface, in Euclidean 
space, 99; of surface of sphere, 83-84; 
time-independence of, 636; transformation in terms 
of matrices, 72-73; two powers of derivatives 
acting on, 349; unfamiliar of spheres, 585; on unit 
spheres, 80e; variations in spacetime, 716 

metric formalism, derivation of divergence and 
Laplacian, 78-79 

metric tensor: of 3-sphere, 296; covariant derivative, 
325; divergence near boundary, 663; and general 
coordinate transformations, 314; general static 
and isotropic, 306; generalized Lemaitre-de 
Sitter, 489; higher dimensional, 682; inverse, 315; 
Kaluza-Klein, 676, 680, 690-691; Kerr, 465-466, 
475; Minkowskian (see Minkowski metric); 
near-horizon Schwarzschild, 445-446; off-diagonal 
components, 292, 459, 466, 474; Rindler, 445-446; 
for space measurements, 63-64; spatial, 
cosmic expansion, 491; time dependent, 455; time 
translation invariance of, 304 

Michell, John, black hole hypothesis, 13 

Michell and Laplace, mass of black hole, 366, 
409 

Michelson-Morley experiment, 163; explained by 
length contraction, 200 

microscopic physics, and topological action, 721 

microstates: Bekenstein-Hawking entropy, 441, 444; 
in de Sitter spacetime, 638 

microwave background, cosmic. See cosmic 
microwave background 

Mie, Gustav, Newton gravity and Lorentz invariance, 
580 

Mills, Robert L. See Yang-Mills theory 

minimum, as solution of variational calculus, 117 

minimum length measurement, limited by special 
and general relativity, 763-764 

Minkowski, Hermann: “mystical” substitution, 640; 
on physical laws between worldlines, 176; on 
space, time, and spacetime, 174 

Minkowski metric, 317, 391; and Einstein’s field 
equations, 563; folded into indices, 182; Rindler 
coordinates, 446 

Minkowski spacetime: (1+1)-dimensional in light 
cone coordinates, 619; accelerated relativistic 
particles, 277; acceleration in, 190; coordinate 
changes, 192e; curves, in, 175; deviations due to 
gravitational waves, 571-572; Dirac action in, 605; 
distance in relativistic action, 210; flat, governing 
action of, 581; generators of conformal algebra for, 
615; geometry of, 175, 191; locally flat coordinates, 
288; maximal extension of, 434; Maxwell action for 
electromagnetic field in, 381; Penrose diagram, 
428f, 434; spherical shell of photons in, 430f; 
surfaces in, 184 

Minkowskian sphere, including time, 631 

Minkowskian time, compared to Newtonian time, 
372 

minus sign, role of, in energy functional, 139 

Misner, Charles W., ADM formulation of gravitational 
dynamics, 693 

mites: flat space analogy, 6; geometer measuring 
curvature, 545 

MIT system, reduction to nothing, 11 

modes, electromagnetic field treated as superposition 
of, 746 

molecules, appearance in early universe, 519 

momentum: angular (see angular momentum); 
energy (see energy momentum entries); exact 
meaning of, 383; of gravitational field, 580-581; 
of graviton’s propagator, 786; Hamiltonian, 144; 
not conserved, 26, 27; physical, and twistors, 731; 
restricted to hyperbolic shell, 220; spatial density 
of, 228; total, conservation of, 37 

momentum conservation, 219; delta function, 740; 
derivation of Einstein’s formula, 232; in terms of 
helicity spinors, 736 

momentum-twistor space, scattering amplitudes as 
volumes of polytopes in, 742 

monopoles, magnetic. See magnetic monopoles 

Morley, Edward. See Michelson-Morley experiment 

Mossbauer effect, measurement of gravitational 
redshift, 284 

mother: of all headaches, plaguing fundamental 
physics, 699; of all vectors, 312-313 

motion: around black holes, 412-416; in curved 
spacetime, 289-290, 301-311; effect on 
coordinates of, 160; in fifth dimension, 676; of free 
particles, 180; law of, 25; relative, of observers, 
168, 181; in static isotropic spacetime, 306-307 

movement: at constant speed, of objects under 
special relativity, 189; along curve, through vector 
field, 544; fuel-economizing, 127 

moving observer, fluids, 230 

moving trihedron, of smooth curve, 97f 

multipole expansion approximations, 568-569 

“Must it be? It must be.”: discovery of action for 
gravity, 346, 346f 

My World Line (Gamow), 177 

mysteries: action principle, 141, 155; Bekenstein- 
Hawking entropy, 444; Bering Strait, 275; black 
holes, 410, 441; caloric, 786; closing orbits, 
30, 60; correspondence between quantum 
statistical mechanics and quantum field theory, 
445; cosmological constant, 356, 711, 751, 782; 
cosmos, 778; “crazy” coordinates, 94; dark 
energy, 356, 711; Einstein’s field equation, 358; 
entropy in spacetime, 234; equality of inertial 
and gravitational mass, 28; family problem, 7; 
fermions, 781-782; Gibbons-Hawking radiation, 
637; grand unification, 527n; gravity, 276, 
778-779; holographic principle, 15; light and 
electromagnetic field, 162; massless particle, 
213; neutrino mass, 359; quantum, 780, 789n; 
quantum gravity, 748, 781; as source of beautiful 
experience, 778; temperature, 15; three copies of 
world, 7; time, 787; universe, 779 


“naked” charged black holes, 478 

Nambu-Goto action, 216e 

naming conventions, for indices, 608 

Nash, John, and embedded spaces, 95 

Nasty and Vicious, dueling thinkers experiment, 7-9 

“natural” coordinate systems, 134 

natural parametrization, 308 

natural quantities: introduced by gravity, 764; and 
unnatural quantities, 218 

natural system of units, 10-12 


naturalness doctrine, 579; in high energy physics, 
749-750; and inverse light speed, 754-755 

Navier-Stokes equation, 234; in fluid dynamics, 164 

near-horizon Schwarzschild metric, 445-446 

negative curvature, definition of, 85 

negative pressure, as consequence of constant dark 
energy density, 360 

negatively curved space, maximally symmetric, 610 

neutral objects, impossibility of under gravity, 716 

neutrino masses, as mystery, 359n 

neutrino oscillations, and cosmological constant 
paradox, 747 

neutrinos: (non)relativistic, 501; scattering of, 765; 
“typical” mass scale of, 700 

neutron interferometry, and equality of inertial and 
gravitational mass, 34 

neutrons: mass of, and anthropic principle, 757; 
primeval nucleosynthesis of, 517-518 

“new and improved” energy momentum tensor, 712 

Newman, Ezra T., Kerr-Newman solution, 477 

Newton, Isaac: action principle, 144; apple falling on, 
268; comparison to Aristotle, 140-141; discovery 
of calculus, 113; existence of God, 520; on his 
youth, 25; inherent instability of gravity, 520; 
miraculous year, 194n; role of second derivative 
in time, 401; shown with orbits on one pound 
note, 31; unification of celestial and terrestrial 
mechanics, 28 

Newton’s constant: Cavendish’s first measurement 
of, 32; dimension of, 346; historical digression on, 
31-32; and quantum gravity, 761 

Newton’s dot notation, 29, 96 

Newton-Einstein-Hilbert action, quantum gravity 
limit, 444 

Newton-Jebsen-Birkhoff theorem, 453 

Newton’s laws, 25-34; law of action and reaction, 
470; law of gravity, 11, 28; as result of variation 
principle, 137; second law, 46-48, 110, 140 

Newton-Leibniz rule: breaking of, 340-341; failure 
in covariant derivatives, 342 

Newton’s superb theorems, 32-33 

Newtonian action, 241-242 

Newtonian approximation, Einstein’s field equation 
in, 577 

Newtonian equation, “analog,” 367 

Newtonian gravitational potential: around black 
holes, 410-411, 411f; compared to Einstein 
potential, planetary orbits, 371; dynamical origin 
of, 578n; fields, 119; quantum gravity corrections 
to, 767; replacement of mass density by energy 
density, 379n 

Newtonian gravity: cube of physics, 13f; deviation 
from, and powers of derivatives, 708-709; 
differences from Maxwellian electrodynamics, 
338; restriction imposed by symmetry, 339; as 
weak gravitational field limit, 391 

Newtonian Lagrangian, 249 



Newtonian limit, 302-303 

Newtonian mechanical analogies, from cosmological 
principle, 507, 513 

Newtonian mechanics: and black holes, 13; 
conservation laws in, 35-37; cube of physics, 13f; 
Galilean invariance of, 161; initial value problem 
in, 400; invariance of, 161; invariance under 
Galilean transformation, 19; reproduction by 
relativistic particle action, 209; role of differential 
equations, 26-27; role of signs in, 382; standard 
notation of coordinates, 25; tensors in, 57-59 

Newtonian orbits: closing of, and tensor notation, 60; 
determination of, 31 

Newtonian time, compared to Minkowskian time, 
372 

Newtonian universe, role of time in, 7 

no-hair theorems, and Kerr black holes, 481-482 

Nobel prize in physics (2011), and dark energy, 361n 

Noether, Emmy, 150; spacetime hidden in scattering 
amplitude, 739-740 

Noether’s theorem: application of, 152; generality 
of, 153; motion in static isotropic spacetime, 310; 
promotion of physical laws, 221; proof of, 151 

non-determinism, of Einstein’s field equations, 403 

non-quantization, of gravity, 768-769 

nonabelian gauge theory, 681; decoupling of 
geometries, 692 

noninteracting free particles, 221 

nonlinear coordinate transformations, 69 

nonlinear gravity, 571 

nonlocal cosmology, 712 

nonlocal phenomena, removal, 784 

nonlocal terms: in action, 751; and cosmological 
constant paradox, 751 

“nonphysical” degrees of freedom, 783 

nonrelativistic action, 241-242 

nonrelativistic gases, 454 

nonrelativistic matter. See dust 

nonrelativistic mechanics, Lagrangian in, 138-139 

nonrelativistic particles, in potential, action of, 356 

nonrelativistic physics, completion and promotion of 
quantities in, 218 

nonrelativistic quantum mechanics, 438; in presence 
of gravitational field, 12-13 

nonrenormalizable interactions, 711-712 

Nordström, Gunnar: derivation of Einstein’s gravity, 
579. See also Reissner-Nordström entries 

Nordström’s theory, road to higher dimensional 
theories, 682-683 

normal, to surface: at certain point, 99f; as timelike 
vector, 184 

normal coordinates, Fermi, 557 

normal vector, tangent plane rotating around, 100 

north pole, and its longitude, 76 

notation: of action principle (Leibniz), 138; of column 
vectors, 45; confusion in variational calculus, 117; 
convenient for vectors, 182; of coordinates, 25, 
62n; cross-product, angular momentum, 48n; 
for differential operator, 72; dot: as symbol for 
symmetry, 29, 96, 129; erroneous, in parallel 
transport, 543; of functionals, 114; of gradient, 
54; group theory of universe, 644; index (see index 
notation); Laplacian, 78-79; of quantities (in 
general), 32; spacetime metric, 183; tensor (see 
tensor notation) 

notation alert, bad: confusion in time dilation, 198; 
confusion in relativistic action, 211; geodesic 
equation, 555 

nothing, waving of, 783 

nuclear force, generated by pions, 205 

nuclear fusion, compared to accretion disk radiation, 
415 

nuclear physics, in early universe, 518 

nucleons, formulation of strong interaction, 785 

nucleosynthesis: primeval, 517-518; stellar, 518-519, 
758 

null infinities, in Penrose diagrams, 428, 428f 

null lines: in general spacetime, 730; in spacetime, 
741f 

null surfaces, 184; acting as membrane, 185; black 
hole horizons as, 422, 468 

number current: inside 3-volume, 226f; as 4-vector, 
225f 

number density: as component of Lorentz-vector, 224; 
of particles in box, 223f; relativistic completion of, 
223; in relativistic form, 224 

numerical relativity: initial value formulation, 693; 
and initial value problems, 400-405; setting up, 403 


obesity index of universe, Schwarzschild radius and, 
443 

observables: appearance of antimatter, 205; 
Heisenberg picture, 771; local, 765, 772, 781; 
quantum mechanics, 48 

observational cosmology, 491, 505 

observers: accelerated, 193, 446-447; different, 185; 
freely falling, metric for, 561; moving and resting, 
166-168; relative motion of in spacetime, 181; 
role in physics, 46-48; studying vector field, 47f; 
uniform relative motion of, 168. See also reference 
frames 

odd-dimensional space, space reflection in, 721n 

offshell information, carried by action, 782 

old man’s toy, 267f 

Once and Future King, The (White), 361n 

one pound note, showing Newton with orbits, 31 

open strings, 696 

open universes, 296-297, 491, 629; critical density, 
497-498; Einstein’s field equations, 493-494; with 
positive cosmological constant, 633 

operational definition of distance, 291, 291f 

operators: annihilation and creation, 447-448; 
differential, 48, 72, 319, 588; quantum, 771, 772 

orbifolds, 700 

orbits: circular, 413-414, 413f, 549; closed, 
verification of, 30; for light moving around black 
hole, 416f; properties of, precession of gyroscopes, 
550 

ordinary differential equations, coupled, relativistic 
stellar interiors, 452 

orthogonal matrices, definition of, 39 

orthonormal frames, 594; erecting, 595f 

oscillator, harmonic, 447; symmetry and invariance, 
242 

osculating plane, of smooth curve, 97f 

Ostrogradsky, M. V., discoverer of inherent instability, 
338 

outer horizon, of Kerr black holes, 468-469 

outgoing brane wave model, 704-707 


p-form, definition of, 597 

Page, Don, Hawking radiation, 449 

Painlevé, Paul, 417 

Painlevé-Gullstrand coordinates, 417 

pair production, 438 

Palatini formalism: action for Einstein gravity, 395; 
derivation of Einstein’s gravity, 583; invention by 
Einstein, 397 

Palatini identity, 389-390; mixing up with Palatini 
formalism, 397 

parabolas, bending in opposite directions, and 
negative curvature, 85 

paradoxes: pedagogical aspects of special relativity, 
203-204. See also cosmological constant paradox 

parallel transport, 543-548; precession of gyroscopes, 
549; of vectors, 101-102, 102f, 545f 

parameter choice for massless particles, 215 

parametrization: invariance of: current, 133, 235; 
natural, 125, 308; of surface, 98; ultrarelativistic 
particle motion, 308 

parametrized post-Newtonian (PPN) approximation, 
309-310, 311 

parity: strong gravitational sources, 574; and space 
reflections, 721n 

partial differential equations, solving, 708 

particle-antiparticle pairs, thermal radiation from 
horizon, 637 

particle cloud, motion of, described by geodesic, 556 

particle collisions, 438; momentum, 219-220 

particle decay, conservation, 237n 

particle horizon, 536 

particle location, versus spacetime, 224 

particle mass, as proportionality factor in relativistic 
action, 211 

particle motion, 198; free, 180; in future light cone, 
178f; in interaction potential, 162; law of inertia of, 
143; multiple coordinates, generalization of, 140; 
in potential, 57-59, 135, 137; simplest case of, 142 

particle physicists, renaming themselves high-energy 
physicists, 713n 

particle physics: approach to gravity, 583n; 
baryogenesis and leptogenesis, 526-528; in 
early universe, 518; evolving of, 753n; scale and 
conformal invariances, 621; standard model of, 
683 

particle theory, use of scalar fields in, 759n 

particles: accelerated: and general relativity, 189, 
193e; anti- (see antimatter); birth and death of, 
198; around black holes, 409-418; in box, number 
density of, 223f; corotating/counterrotating, 474; 
de Broglie wavelength at Schwarzschild radius, 
442; electromagnetic field acting on, 246, 250; 
under external force, 190; and fields, 145-146, 
384; of finite size, motion of, 714; free, 302; and 
gravitational waves, 566; intrinsic lifetime of, 198; 
massive, 659-660, 659f; massless (see massless 
particles); near barrier, path integral formalism 
for, 781; noninteracting (see dust); notation of 
position, 117; point (see point particles); relativistic 
action, 208-209; at rest, Newton’s laws, 142; 
ring of, responding to gravitational wave, 567f; 
scattering of, Lorentz invariance, 236e; separation 
of, for different polarizations in gravitational 
waves, 566; on a sphere, 148, 645; spin 1, 256; 
teleological behavior of, 139; test, 302, 309; wimps, 
522; worldlines of, 177f, 211f, 380 

partition function of quantum systems, 445 

passive diffeomorphism, coordinate transformation 
as, 398 

path: in 2-dimensional Cartesian space, 123; 
actual, extreme value of action, 141; choosing, 
as metaphor for life, 139-140; of falling apple, 
determination of, 137; Feynman’s, to rescue a 
drowning girl, 3-4, 4f; harmonic oscillator, 148e; 
least path principle, 3, 5-6; length of, 189, 190; of 
light, 175, 656, 665; mean free path of photons, 
517; shortest (see shortest path); straight and 
narrow, deviation from, 143; through spacetime, 
638. See also distance; length 

path integral (Dirac-Feynman) formulation: 
determining Hawking radiation, 445; and local 
observables, 772; quantum gravity, 781, 783; 
quantum physics, 770; understanding of quantum 
mechanics via, 141 

Pauli matrices, in Lorentz algebra, 187 

Pauli spinors, as “square root” of Lorentz vector, 731 

Peierls, Rudolf, on thinking and calculating, 133 

Penrose, Roger, and twistors, 730-731 

Penrose diagram, 427-429; black hole formation, 
430; for causal structure of de Sitter spacetime, 
639f; charged black holes, 480f; de Sitter 
spacetime, 638; of Minkowskian spacetime, 428f; 
Schwarzschild black hole, 429f; time translation, 
620 

Penrose process, 449, 469-471; angular momentum 
loss, 471-472; area theorem, 472 

Penrose’s vision, on role of light rays, 741 

Penzias, Arno, cosmic microwave background, 517 


perfect fluids: assumptions during discussion, 237n; 
and comoving observers, 229; definition of, 230; in 
outgoing brane wave model, 704-705; relativistic 
stellar interiors, 451; universe filled with, 492-493 

perihelion shift: around black holes, 413; around 
Mercury, 368-369, 369f; in Schwarzschild metric, 
371-372 

perturbation, relevant, in cosmic diagrams, 512 

perturbative correction, to electromagnetic scattering 
of point charges, 766 

perturbative expansion, failure of, 770 

perturbed spacetime metric, gravitational sources of, 
569 

Petrov notation, of Riemann curvature tensor, 352e 

phase angle, of wave function, in Kaluza-Klein 
theory, 678 

phase boundaries, in cosmic diagrams, 513-514 

philosophic arguments, power of, 779 

photons: collisions with electrons, 222f; compared 
to gravitons, 768; decoupling of, and matter 
dominance, 788n; depending on geodesics, 665; 
frequency shift, in scattering, 222; momentum of, 
232; movement along time axis, 665; movement 
in dueling thinkers experiment, 7-9; parameter 
choice for propagating, 215; in primeval universe, 
516f; relativistic action, 212; role of electric charge 
for, 383; spherical shell of, 429, 430f; temperature 
of gas of, 495. See also light; massless particles 

physical momentum, and twistors, 731 

physical reasonability, 557 

physical singularities: compared to coordinate 
singularities, 91-92; and coordinate singularities, 
365-366; Kerr black holes, 467, 467f;
Schwarzschild black holes, 418, 425; timelike, 
479 

physicists: good versus great ones, 167; particle, 
renaming themselves high-energy physicists, 
713n; physics being independent of, 219 

physics: on cosmological distance, 750; cube of, 12-13;
Descartes approach to questions in, 583n;
effectiveness in understanding the universe, 779; 
and expression of physics in terms of equations, 
difference of, 47; fundamental, Mother of All 
Headaches, 699; goal of, 757-758; independence 
of physicists, 219; internal consistency of, 780; 
linkage between high energy and low energy 
physics, 752; most famous equation of, 220-221; 
need to be local, 757; present understanding of, 
712; quantum (see quantum physics); relevance of 
topology to, 728; role of clocks and rulers, 719- 
720; role of observer, 46-48; sensitive to topology 
of spacetime, 720; start of, 143n; teleological 
discussions in, 136; theoretical (see theoretical 
physics); translation invariance of, in static 
spacetime, 304f; ultimate equation of, 47-48 

physics terms, least appropriate, 516 

Pioneer anomaly, 311 
pions: formulation of strong interaction, 785; mass
prediction of, 205; negatively charged, 206 

Pisa, Leaning Tower of, 270 

planar coordinates, of expanding universe, 630 

Planck, Max: Einstein’s appraisal of his 
understanding of general theory of relativity, 370; 
personal life, 10; and ultraviolet catastrophe, 
789n 

Planck area, and entropy of black holes, 442 

Planck brane, 700 

Planck constant, 11; dependence on mass-energy 
scale, 781 

Planck length, 11-12; charge quantization in 
Kaluza-Klein theory, 677; in effective field theory
approach, 709; and large extra dimensions, 699; 
as minimum length to probe quantum effects, 
762; as smallest distance experimentalists can 
measure, 764 

Planck mass, 11-12; amount of, 583; and 
cosmological constant paradox, 746-747; in higher 
dimensional theories, 681; in Kaluza-Klein theory,
675; as largest mass fundamental physics, 748; 
quantum gravity limit, 444 

Planck scale, in early universe, 518 

Planck time, 11-12 

Planck units: and entropy of black holes, 441; and 
quantum gravity, 761 

plane: flat, curvature of, 105; osculating, of smooth 
curve, 97f

planetary orbits, in Schwarzschild metric, 371-372 

planets, celestial mechanics, 28 

Poincaré, Henri: and Lorentz transformation, 169n; 
and special relativity, 190; understanding of waves, 
783-784 

Poincaré algebra: extension to conformal algebra, 
617; generators of, 192 

Poincaré coordinates: in anti de Sitter spacetime, 
656; numbers of boundaries, 664 

Poincaré group, transformations and translations of, 
666 

Poincaré half plane, 67f; and anti de Sitter / conformal 
field theories (AdS/CFT), 68; determination of 
geodesics, 127; with differential forms, 608; 
finding geodesics of, 133; geodesics on, 134f; in 
higher dimensions, 656; and metric, 67-68; and 
temporal boundary, 632 

Poincaré invariant brane, 707 

point charges, electromagnetic scattering of, 766 

point of view, local versus global, 141 

point particles: action for, relativistic, 208-209, 210; 
associated current of, 235; energy and momentum 
of, 379-380; motion of, 714, 676; nonrelativistic 
action, 241 

point-to-line map, from twistor space to spacetime, 
742 

pointlike particles, worldline length of, 215 

points: circles mistaken for, 674f; distance of in space
and time, 174-175; isometric geometry of, 585; in
spacetime, 177, 689f, 742; in twistor space, 741f

Poisson’s equation: gravitational potential satisfying, 
708; membrane shape, 118; for Newton’s gravity, 
231 

polar coordinates: change from Cartesian 
coordinates, 29, 62, 71; Christoffel symbols of, 129;
on flat plane, 125; to solve celestial mechanics, 29; 
transformation into locally flat coordinates, 89; 
warped, 613e 

polar-like coordinates, comoving, 298 

polarization tensor, of gravitational waves, 565 

polarization vectors, written in terms of helicity 
spinors, 735n 

polarizations: degrees of, in gravitational waves, 
564; different, in gravitational waves, 566; of 
gravitational waves, 734; as helicity states of 
graviton, 566 

Polchinski, J., deletion of Feynman diagrams, 756 

pole in the barn problem: spacetime view of, 202f; of 
special relativity, 201, 201f 

poles, and their longitudes, 76 

polyhedra, angular deficits of, 726-727 

polytopes in momentum-twistor space, and 
scattering amplitudes, 742 

position, of particles: as general space coordinate, 26; 
notation, 117 

position determination, and position of measuring 
device, 763-764 

positive cosmological constant, and expanding 
universe, 392 

post-Newtonian approximation: Einstein’s field 
equation in, 577; parametrized, 309-310, 311 

potential: central, 36; and consistency or integrability 
condition, 36; cosmic, 508-509, 508f; definition 
of, 35; electromagnetic, in fifth dimension, 677; 
external, translation invariance, 242; gauge, 
emergence of Yang-Mills theory, 688; gravitational, 
578; inflaton, 535f; introduced into relativistic 
action, 209; linear, 139; Newtonian, around 
black holes, 410-411; particles moving in, tensor 
notation of, 57-59; rotationally invariant, 150; 
translation invariant, particle movement in, 151; 
vector, 243, 248; Yang-Mills gauge, 682 

potential energy, of a marble in a bowl, 113 

potential energy functional, action principle, 146 

power series: expansion of functional, 115-116; 
introduction of, 41 

powers of derivatives, deviation from Newtonian 
gravity, 708-709 

Poynting vector, emergence of, 382 

PPN (parametrized post-Newtonian) approximation, 
309-310, 311 

precession: of gyroscopes, 465, 549-551; Lense- 
Thirring, 550; in Schwarzschild spacetime, 
549 


precession angle, 550 

predictions, verified for Einstein’s theory, 777 

pressure: Fermi, Chandrasekhar limit, 455; 
relativistic energy contribution of, 230; of 
universe, relation to energy density, 359 

pressure gradient: of relativistic stellar interiors, 453; 
in universe filled with perfect fluid, 493 

primed coordinates, 18, 38; metric with, 71-73 

primed frames, in algebra, 196 

primeval nucleosynthesis, 517-518 

primeval universe, 516f 

“primeval” vectors, and coordinate transformations, 
73 

Princeton University, fundamental physical 
equations on glass windows, 138 

principles: action (see action principle); anthropic (see 
anthropic principle); of causation, and gravitation 
law, 404; Copernican, 491; cosmological 
(see cosmological principle); equivalence (see 
equivalence principle); fundamental, 12; Galileo’s 
relativity principle, 17-19, 159; “golden” guiding, 
338; holographic (see holographic principle); least 
path (see least path principle); least time (see least 
time principle); locality (see locality); of presumed 
innocence, 299; uncertainty (see uncertainty 
principle) 

problem: of not enough time, 521-522, 531; 
prototype of solutions, 222 

Professor Flat: discusses Christoffel symbols, 
132-133; on local flat coordinates, 130 

projection: stereographic, 80-81e, 81f, 641; of vectors 
on tangent plane, 102 

projective space, integrating over, 740 

promotion, law of, 219 

propagator, for graviton, 573 

proper distances, 296-297 

proper time, 181; definition of and motion of light, 
659; for different observers, twin paradox, 189; in 
electromagnetism, from special relativity, 244; in 
Minkowskian spacetime, 179; parameter choice 
for massless particles, 215 

proper time duration, of particle, 210 

proper time interval, invariance of, 199 

proton decay, 527; analogy to cosmological constant 
paradox, 753-754; and anthropic principle, 757 

protons: delayed recombination of, 516-517; 
primeval nucleosynthesis of, 517-518 

Proust, Marcel, on time, 205 

pseudo-Euclidean spaces, 653 

pseudo-time coordinate, 657 

pseudospheres. See hyperbolic spaces 

pseudotensor, energy momentum, 386 

psychological time, 175n 

Ptolemy: and concept of coordinates, 62n; and the 
term “second” used in measuring angles, 368n 

pulsars, emission of gravitational waves, 563 

pulsating mass distribution, 571 


pulsating stars, 304 

punctured surfaces, 726 

puzzle, “What is greater than God?” 789n 

Pythagoras: calculation of length of hanging string, 
113; motion in static isotropic spacetime, 305, 309 

Pythagoras theorem, for space and time, 167 

Pythagorean time, and radar echo delay, 372 


QFT. See quantum field theory 

quadratic derivatives, added to Lagrangian, 338 

quadrupole formula, derivation of, 576e 

quadrupole radiation, gravitational, 571 

quantities: auxiliary, calculus, 129; conserved: 
and Noether’s theorem, 30, 152; definition of 
conceptually natural, 219; extensive, 441; index 
notation of, 32; natural, introducing to gravity, 
764; transformation of, 218; without qualities, 
scalar fields as, 788n 

quantization: of electromagnetic field, 764; of 
gravitational field, 582; of gravity, 780, 788n 

quantum: dependence of action, 783; mystery of, 780 

quantum chromodynamics, 526, 785 

quantum electrodynamics, difficulties in, 764 

quantum field theory (QFT), 247; antimatter in, 
476; calculating vacuum energy by, 752-753; 
commutation relations, 192; correspondence 
with quantum statistical mechanics, mystery of, 
445; cube of physics, 13f; in curved spacetime, 
780; cutoff in, 758n; in de Sitter spacetime, 648; 
harmonic oscillator in, 361; as low energy effective 
theory, 711-712; motivation for development 
of, 384; motivation for studying twistors, 731; 
not consistent with classical relativity, 773n; 
questions on, 781; restless vacuum in, 436-438; 
understanding of, 746; use of scalar fields in, 759n 

quantum fields, appearance in action, 213n 

quantum fluctuations: contributing to vacuum 
energy density, 746; of fields, 784; Hawking 
radiation originating from, 436; in inflationary 
cosmology, 533; thermal radiation from horizon, 
637; Unruh effect, 446; vacuum as boiling sea of, 
745-746 

quantum gravity: anti de Sitter spacetime, container 
for, 649; cube of physics, 13f; divergent behavior 
of, 766; fundamental scales, appearance of, 760-761;
governed by attractive ultraviolet fixed point,
773n; handwaving arguments for, 769; Hawking 
radiation, 439, 443-444; heuristic thoughts about, 
760-774; as Holy Grail of physics, 12; impossibility 
as a quantum field theory, 765; as local field theory, 
781; local observables, absence of, 772, 781; loop, 
772; mystery of, 748, 781; “naked” singularities, 
480; Newtonian potential, corrections to, 767; 
nonperturbative treatment of, 770; path integrals, 
781; Planck length as minimum length to probe, 
762; and problem of knowing the position of 
measuring device, 763-764; and Schrödinger’s cat
experiment, 771; and “strangeness” of black holes,
764-765; taming of, 731; thought to follow from 
quantum electrodynamics, 764; trouble by Planck 
mass, 761; and ultraviolet completion, 765; from 
world described by “matter fields” and a metric, 
770. See also Kaluza-Klein theory; string theory

quantum Hall effect, fractional, 789n 

quantum Hall fluid, and ground state degeneracy, 
723 

quantum hydrodynamics, analogy to quantum 
gravity, 759n 

quantum mechanics: cube of physics, 13f; derivation 
of Hawking temperature, 445; special relativity 
and, 437; spin 1 particles, 256; use of operators in, 
48 

quantum of gravity. See gravitons 

quantum of light. See photons 

quantum operators, 771, 772 

quantum particles, in classical gravitational field, 771 

quantum physics: difference from classical physics, 
360-361; discord with Einstein gravity, 768-769; 
equivalent formulations for, 770; observables in, 
772. See also physics 

quantum statistical mechanics, correspondence with 
quantum field theory, mystery of, 445 

quantum systems: on torus, 723n; with zero 
Hamiltonian, 723 

quantum tunneling, and Hawking radiation, 449 

quarks: baryogenesis, 526; families of, 786; masses 
of, and anthropic principle, 757-758 

quotient theorem, 316-317 


r, use of letter in different situations, 95 

radar echo delay experiments, 373f; as test of Einstein 
gravity, 372-373 

radar ranging, 291 

radial coordinates, hyperbolic, 653-654 

radiation: accretion disk, compared to nuclear 
fusion, 415; background (see cosmic microwave 
background); black body, of black holes, 436; 
Gibbons-Hawking, 449, 638; Hawking (see 
Hawking radiation); quadrupole, graviton spin, 
571; role in dissipative collapse, 521; from rotating 
black holes, 473-475; thermal, from de Sitter 
horizon, 637; universe dominated by, 495-497 

radiation density, and scale factor of universe, 496f 

radion field, 680; calculation of 5-dimensional scalar 
curvature, 686 

radius, role of, in Schwarzschild metric, 364-365 

rapidity, of boosts, 188 

Raychaudhuri equation, 449, 555-556 

A la recherche du temps perdu (Proust), 205 

recombination, delayed, 516-517 

rectilinear container, infinitesimal, 80e 

redshift: cosmological, 295; gravitational, 259, 282-283,
303-304, 412; infinite, outside Kerr black
holes, 462, 466, 469; relativistic, of frequency, 186

redshift factor, 295

redshift formula, 299, 490; as cosmic clock, 504 

reducible representations, 54-57 

Reed, Ishmael, stellar nucleosynthesis, 518 

reference frames: change of, and covariant derivative, 
103; comoving, preferred flow direction, 230; 
different, for Einstein’s clocks, 167f; dueling 
thinkers experiment, 7-8; and falling ring of 
balls, 59f; nearby, connected by 1-forms, 600; 
orthonormal, 594, 595f. See also observers 

reflections, space, 721n. See also rotations 

refraction, as principle phenomenon, 4 

Regge calculus, 726n 

Reissner-Nordström black holes, subextremal, 483

Reissner-Nordström spacetime, 477-478


relativistic action: accelerated frames, 285; 
gravitational time dilation, 284; matrix theory, 210 

relativistic completion, 218, 242-243; of current, 223 

relativistic curl, 4-vector, 252

relativistic Doppler shift, 185-186, 222 

relativistic fluid dynamics, 233 

relativistic kinematics, 221 

relativistic matter. See also radiation 

relativistic particles. See massless particles 

relativistic stellar interiors, 451-457 

relativistic strings, generalization of action for, 210n, 
215 

relativistic unification, 247 

relativistic wave equation, standard, 565 

relativity: in American football, 172f; concept of, 17-20;
definition of, 17; Galileo’s principle of, 17-19,
159; general (see general relativity); numerical,
400-405, 693; special (see special relativity)

relativity principle, Galilean, 159 

relevant events, time dilation, 197 

relevant perturbation, in cosmic diagrams, 512 

relic photons, 517 

relic problem, 532 

renormalizable interactions, 711-712 

renormalization group flow, 511 

renormalization group ideas, and scaling, 754 

reparametrization invariance, variational calculus, 
123 

repeated index summation. See summation 
convention 

representation: ambitwistor, 736, 739; defining, of 
rotation group, 54; fundamental, of rotation group, 
54; of groups with subgroups, 225; index-free, of 
vector fields, 319; reducible versus irreducible, 
54-57 

representation theory, 54 

repulsion, between like electric charges, 707 

rescaling: of complex parameters, 733; invariance 
on, 559 

rest frame: of gyroscope in parallel transport, 549; 
length contraction, 199; with proper time, 179 

restless vacuum, in quantum field theory, 436-438 


restrictions: of groups to subgroups, 57; by Lorentz 
symmetry, 339; of metric, by isometric condition, 
586; of momentum, to hyperbolic shell, 220 

Ricci-Curbastro, Gregorio, 345 

Ricci tensor, 449; for 2-brane model, 701; in anti de 
Sitter spacetime, 612; calculation of 5-dimensional 
scalar curvature, 685; for charged black holes, 478; 
combined with scalar tensors, 388; computation 
of, 357-358, 362; cosmic expansion, 490-491; 
derivation of Raychaudhuri equation, 556; 
introduction of, 345; proportional to metric, 
492; for relativistic stellar interiors, 451-452; in 
Schwarzschild solution, 363-364; for spherically 
symmetric static spacetimes, 611; vanishing of, 
348; variation of, 390, 395 

Riemann, Bernhard: determination of curvature 
of space, 65; pioneering work in extending 
differential geometry, 91; quest for curvature, 339 

Riemann curvature: components of, in Einstein 
gravity, 89; as found by Riemann, 90-91; and 
parallel transport, 545. See also curved spacetime 

Riemann curvature tensor, 546; alternative 
derivation of, 547-548; anti de Sitter spacetime, 
651; computation of, 349-350, 362, 607; 
constraints on, 591; cyclic symmetry of, 351e; for 
de Sitter spacetime, 626; derivation of variation 
of, 389; determination of, 341-343; directly 
from 2-form, 611; form of, 90; formation of 
scalar curvature from, 345-346; on geodesic, 
in Fermi normal coordinates, 560; Hawking 
Radiation, 438; indices, number of, 131; of Kerr 
metric, 476; in locally flat coordinates, 553; in 
maximally symmetric spaces, 589; structure of, 
351; symmetry properties of, 343, 561; vanishing 
of, 348; variation of, 347; and variations of metric 
in spacetime, 716 

Riemann normal coordinates. See locally flat 
coordinates 

Riemannian, manifolds, 599-600 

Riemannian geometry, 280; determination of weak 
field action, 572; fear of, 82 

Riemannian manifolds: Cartan formulation of, 
601; choice of metric, 88; definition of, 95; 
Killing vectors of, 588; nearby geodesics on, 552; 
specification of curvature of, 89 

Riemannian spacetime: fundamental scalars in, 365; 
generalization of parallel transport to, 543 

Rindler coordinates, 193f, 660 

Rindler metric, 446 

Rindler transformation, in Minkowski spacetime, 
192e 

ripples in spacetime, 563; propagation of, 667 

RNA folding, and punctured surfaces, 728 

Robertson, Howard P.: rejecting Einstein’s article, 
564. See also Friedmann-Robertson-Walker 
universes 

Rogers, Eric, neighbor of Einstein, 267 


Rosen, Nathan: Einstein-Rosen bridge, 433; 
gravitational waves, 563n 

Rosenfeld, L., nonconsistency of quantum field 
theory and classical relativity, 773n 

rotating black holes, 414, 458-476; angular 
momentum of, 576; and Boyer-Lindquist 
coordinates, 78; frame dragging, 460f; offdiagonal 
metric component, 459; Penrose process, 449; as 
sources of radiation, 473-475; ’t Hooft’s bound, 
442. See also Kerr black holes 

rotating bodies: angular momentum of, 563-577;
slow velocity of, 570; spacetime deformation by,
460

rotating mass distributions, 569

rotation groups, 317; generalized, 191; generators of,
40, 192; as invariance group of physics, 755; Lie
algebra of, 191; representation of, 54; subgroup of
Lorentz group, 192. See also specific groups

rotation matrix: and covariant differentiation, 321;
definition of, 38

rotational invariance, 118; inverse square law, 120,
697; in Newton’s second law, 140

rotations: approach generalizing to higher
dimensional spaces, 42; under coordinate
transformations, 72-73; definition of, in matrix
form, 40; determination in manifolds, 590; and
exponential function, 41; as freedom left in higher
dimensional space, 88; in higher dimensional
spaces, 44-45, 49-51; hyperboloid of, 625; and
index notation, 44-45; as invariant transformation,
186; as linear transformations, 68; order of, 50f;
in plane, 38-40; similarity to metrics, 181; in
spacetime, 174

Roth, Philip, The Ghostwriter, 254 

Royal Society, expeditions to test Einstein’s theory, 
367 

rubber sheet analogy, misleading for black holes, 432 

rulers, observed in different frames, 199f 

Rumford, Count (Benjamin Thompson), energy
conservation, 387n 


saddle point, determination of surface curvature, 
105f 

Sakharov, Andrei D., grand unified theory, 529 

Sandage, Allan, closed and open universes, 296-297 

satellites: onboard gyroscope measurements, 549; 
radar echo delay experiments, 373 

scalar action, energy momentum tensor for, 387e 

scalar check, of Schwarzschild metric, 365 

scalar curvature: for 2-brane model, 700; 5- 
dimensional, 684-686; constant, of maximally 
symmetric spaces, 589; of expanding universe, 
609; formation from Riemann curvature tensor, 
345-346; and mass dimensions, 711; and other 
coordinate scalars to form a metric, 708-709 

scalar fields: action, 332; in AdS/CFT correspondence, 
665; charged, in 5-dimensional theories, 687; 
Lagrangian in, 712; as quantities without qualities,
788n 

scalar product: of four vectors under Lorentz 
transformation, 182; invariant in parallel 
transport, 544; of vectors, definition of, 39 

scalar tensors, combined with Ricci tensor, 388 

scalars: and coordinate transformations, 73; 
differentiation, 318; in general relativity, 315; 
and invariance, 47; objects without indices not 
transforming as, 719-720; rotational, 225 

scale and conformal invariances: and naturalness 
doctrine, 750; in particle physics, 621 

scale factor of universe, 289, 293, 489; and Big 
Bang, 499f; cosmological equation, 633; and 
energy density, 496f; in inflationary cosmology, 
534; primeval density fluctuations, 524; redshift 
formula, 299 

scales, 750; physics on different length, 750 

scaling: at cosmological distances, 753-754; metric 
invariant under, 657 

scaling dimensions, of terms in action, 713n 

scattering: 4-gluon, 744e; Compton, 222f, 235e;
electromagnetic, of point charges, 766; of 
electromagnetic wave on atom or molecule, 
715; gluons, Feynman diagrams for, 735-736; 
of gravitons (see graviton scattering); impact 
parameter for, 309, 309f, 416; of neutrinos, 
765; particle, Lorentz invariance, 236e; photons, 
frequency shift in, 222 

scattering amplitudes: 4-gluons, 738; ambitwistor 
representation for, 737; dimensional analysis, 
717, 761, 770; and effective field theory, 770; 
expressed in terms of helicity spinors, 734-735;
Fourier transformation of, 736; gluons,
785; for gravitational wave on finite sized 
object, 717; gravitons (see graviton scattering); 
spacetime hidden in, 739-740; in terms of helicity 
spinors, 735-736; as volume of polytopes in 
momentum-twistor space, 742 

scattering cross section, electromagnetic wave on 
atom or molecule, 715 

Schild (Kerr-Schild form), 476 

Schrödinger’s cat experiment, quantum gravity,
771

Schrödinger equation, for (nonrelativistic) charged
particle in magnetic field, 354n

Schwarzschild, Karl: letter to Einstein, 362; meaning
of name, 363

Schwarzschild black holes: escape from, 427; 
Hawking temperature of, 436; and Kerr black 
holes, 468; Kruskal-Szekeres coordinates, 
635; Kruskal-Szekeres diagram of, 426f; mass 
determination, 570; Penrose diagrams, 429f 

Schwarzschild-de Sitter spacetime, 375e, 635 

Schwarzschild-Droste metric, and solar system tests 
of Einstein gravity, 362-371 

Schwarzschild metric: derivation of, 347; discovery
of, 364; Kruskal-Szekeres diagram for, 425-427;
near-horizon, 445-446; Painlevé-Gullstrand
coordinates, 417; perihelion shift in, 371-372; 
planetary orbits in, 371-372 

Schwarzschild radius, 409; in Kerr solutions, 461, 
465; relation to actual radius, 366; role in metric, 
364-365; of universe, 514; and universe’s obesity 
index, 443 

Schwarzschild singularity: coordinate, 365-366; 
impossibility of, in toy model of spherical cluster 
of noninteracting particles, 376n 

Schwarzschild solution: as limit of Kerr solution, 
466; with charged central mass (see charged black 
holes); Kruskal coordinates as extension to, 433; 
time-dependent mass distribution, 374; Weyl’s 
way to, 374 

Schwarzschild spacetime, 292; precession in, 549; 
spherical shell of photons in, 430f 

“second,” meaning of term used in measuring 
angles, 368n 

second derivative in time, role for dynamics, 
Newton’s insight, 401 

second law of black hole thermodynamics, 472 

second order corrections, to locally flat Euclidean 
metric, 88 

segments, infinitesimal, space and time experience 
of, 180 

self-interacting scalar field, 387e 

self-tuning, 706 

semi-circles, as geodesics, 133 

Shapiro, Irwin I., radar echo delay experiments, 
372-373 

sheets, swept out by strings, 216f 

“shift,” 691, 693 

shortest path: in curved spacetime, 276; 
determination of, 155; on earth’s surface, 
275; and parallel transport, 545; in spacetime, 
176f. See also geodesics; path 

“shut up and calculate,” 445 

sign: most significant in physics, 176; role in 
electromagnetism, 382 

sign error, in action variation, 380 

sign function, in Green’s function, 573 

signature, of spacetime, changing of, 732-733 

Silberstein, Ludwik, understanding of Einstein’s 
theory, 369-370 

similarity transformations, definition of, 56 

simultaneity: dependence on observer, 8; Einstein’s 
gedanken experiments, 7-9; failing of, 166; fall of,
200 

single particles, ignoring gravitational waves, 566 

singularities: at Big Bang, 498; clothed, 479; 
coordinate, 91-92, 365-366, 467, 467f; physical, 
418, 425, 467, 479; at poles of Mercator map, 
365; Schwarzschild, impossibility of, 376n; at 
Schwarzschild radius, 409; of Schwarzschild 
solution, 365; spacetime, at Big Bang, 498;
spherical, paper by Kruskal, 376n; with trapped
surfaces, 484

sink, in cosmic diagram, 511 

sky, reason for being blue, 715 

SL(2, C) group, 730 

SL(4, R) group: explanation of, 739; and twistors, 
737 

slow roll scenario, 535-536 

slow rotation limit, Kerr black hole, 571 

smooth functions, and delta function, 33e 

Snell’s law, 9e 

SO(3, 1) group, 730 

SO(3) group, generators of, 44 

SO(3) transformations, 57f

SO(6) group, 619 

SO(D) group: index notation of, 49; Lie algebra for, 
51; Minkowski spacetime, 191 

soft photon theorems, 217n 

solar eclipse expeditions, 367; praise by J. J. 
Thomson, 369 

solar system, tests of Einstein gravity, and 
Schwarzschild-Droste metric, 309, 362-371 

Soldner, Johann, calculation of deflection of light by 
astrophysical objects, 366-367 

solid state structures: gauge potential of, 721. See also 
condensed matter physics 

solitons, included in quantum field theory, 781 

SO(m, n) groups, and complexification, 732 

Sommerfeld, Arnold: introduction of fine structure 
constant, 767; letters from Einstein, 344, 366, 580 

sound horizon, 524 

sound speed: in metals, ratio to light speed, 749; in 
static relativistic fluid, 234 

south-pointing carriage: function of, 109; modern 
version of, 104f 

south pole, and its longitude, 76 

space: closed curved, 681; conformally flat, 80-81e; 
creation of, 498, 787; curled up, 673-674, 674; 
determination of curvature, 65-66; dimensionality 
and inverse square law, 697; homogeneous, 289, 
292, 491, 588, 704; hyperbolic, 296, 491, 590, 627, 
633; internal, 688, 689; isotropic, 289, 292, 305, 
491, 588, 704; local versus global character of, 76-77;
maximally symmetric, 585-593, 588; metric
in geodesic equation, 128; negatively curved, 
maximally symmetric, 610; replacing time, 137; 
and spacetime, classification of, 666; of spheres, 
and de Sitter spacetime, 646; spherical, of closed 
universe, 633; and time, lyrical confounding of, 
174n 

space coordinates: as dynamical variable, and energy
momentum tensor, 381; notation, 25

space measurements, metric tensor for, 63-64

space reflections, in odd-dimensional space, 721n

spacelike 3-dimensional hypersurface, 693f

spacelike curves, 175

spacelike distance, 175

spacelike events, temporal ordering of, 204


spacelike geodesics, tentacles of, 558 

spacelike hypersurfaces. See Cauchy surface 

spacelike infinity, 428, 428f 

spacelike Killing vector, 637 

spacelike surfaces, 184 

spaceship: ball of whiskey in, 270; in orbit around 
earth, 266 

spacetime(s): 4-dimensional, divergence theorem 
generalized to, 386; 5-dimensional (see Kaluza-Klein
theory); annihilated, 785; anti de Sitter,
612, 702; boundary of, 399n; causal structure of, 
427, 431, 438, 530, 531f, 780; changing signature 
of, 732-733; circles mistaken for points, 674f; 
conformally equivalent, 311; conformally related, 
622e; constancy of dark energy density in, 359; 
constructed by piling sheets, 689f; curved (see 
curved spacetime); dark energy density in, 356; 
de Sitter, 456, 624-648; deformation by rotating 
bodies, 460; discretization of, 773n; disguises of 
anti de Sitter, 654; distance measurements in, 
180; distance of comoving observers, 174; divided 
into regions, 635; Einstein’s equivalence principle, 
271; empty, 347-348, 362; event, definition 
of, 177; flat, with conformal algebra, 615; four 
dimensional, 174; geometry of, 174-193; and 
gravity, origins of, 787; how to generate, 338; 
human in, 658f; inside stars, 453; inversion of, 
743-744; isometric, around rotating black holes, 
459; Kerr, 470-471, 473; Lagrange multiplier 
for volume of, 756; and large extra dimensions, 
697; mapping of, holographic principle, 649; 
Minkowskian, 277, 434; Minkowskian metric for, 
181; next steps of understanding of, 784; null 
lines in, 741f; number current as 4-vector in,
225f; paths lengths, 189; perpendicular to internal 
space, 689; propagation of ripples, 667; pulsation 
communicated to outside, 571; Pythagoras 
theorem of, 167; regions of, 635; ripples in, 
563; Schwarzschild, 292; Schwarzschild-de
Sitter, 375e; separation of events in, 160; “sewing 
together” of two distinct, 429-431; shortest path in, 
176f; singularity at Big Bang, 498; small enough 
region of, and Einstein’s equivalence principle, 
712; spherically symmetric, time dependent, 311; 
around spherically symmetric mass distribution, 
304-307, 310-311, 409; spinors in curved, 604-605;
static, 61, 303-304; static isotropic, motion
in, 306-307; stretching of, 615; thermodynamics 
of, 448-449; topology of, physics sensitive to, 720; 
as a triangle, 428, 434; twistor space point-to- 
line mapped to, 742; and twistors, 739-740; and 
variations of metric in, 716 


spacetime curvature. See curved spacetime 

spacetime derivative, two powers of, role in Einstein 
field equation, 402 

spacetime dimensions, four, 174 

spacetime events, light rays being more fundamental 
than, 741 




spacetime fluctuations, 762 

spacetime metric: around spherical mass 
distribution, Schwarzschild solution, 363-364; 
around stars, 62; formal similarity to rotation, 181; 
notation of, 183; perturbed, gravitational sources 
of, 569. See also metric 

spacetime picture, thinking in terms of, 28 

spatial boundaries, 655; in anti de Sitter spacetime, 
649 

spatial coordinates: in continuum mechanics, 117; 
emerging in AdS/CFT correspondence, 787; 
growing from boundary, 660; and location of 
particles, difference between, 31 

spatial curvature: for closed, flat, and open universes, 
634; effect on CMB fluctuations, 525-526 

spatial distance, in general curved spacetime, 
290-292 

spatial metric, and cosmic expansion, 491 

special matrices, 40 

special relativity: abstract of, 20; accelerated particles, 
193e; applied, 195-206; counterintuitivity, 204; 
electromagnetism from, 244-246; in everyday 
life, 205; geometrical view of, 582; pedagogically 
correct presentation, 203; performance of young 
Einstein, 783; problems in, foolproof method for 
solving, 195; and quantum mechanics, 437; time, 
different rates of, 196 

speed limit, existence of, 172 

speed of light. See light speed 

spheres: in 3-spaces, distances of, 610; curvature of 
surface of, by Gauss’s strategy, 105; d-dimensional, 
definition of, 624; in de Sitter spacetime, 624; 
determination of metric on, 65; as example 
for curved space, 83; generalized, 92; higher 
dimensional, metric of, 80e; and hyperbolic 
spaces, 93; “at infinity,” 428; intersecting, 647f; 
intrinsic and extrinsic curvature of, 6, 85; metric 
of surface of, 83-84; squashed, 469; stereographic 
projection of, 80-81e, 81f; tangent plane of, 98; 
topology of, 727; unfamiliar metrics of, 585 

spherical blobs, 725f; growing a trunk, 726 

spherical coordinates: change from Cartesian 
coordinates, Euclidean spaces, 63; introduction of, 
108 

spherical shell of photons, 429; in Minkowski 
spacetime and Schwarzschild spacetime, 430f 

spherical symmetry, comoving coordinates, 298 

spherically symmetric mass distribution, 304-307; 
around black holes, 409; Christoffel symbols, 
310-311; foliation, 305-306; Killing vectors 
for, 305; Schwarzschild solution for spacetime 
metric around, 363-364; time dependent, and 
Jebsen-Birkhoff theorem, 373-374. See also stars 

spherically symmetric spacetime: static, 61; time 
dependent, 311 

spin 1 particles, 256 

spin connection, for index transformation, 603 

spin fields, in terms of spinor field, 789n 
spin vector: parallel transport of, 549; precession of
for particle in orbit, 550 

spinor fields, and gauge potential, 789n 

spinor indices, “metric” for, 742 

spinors: complexified, 732; in curved spacetime, 
604-605 

splitting of energy levels. See energy level splitting 

spooky action, Newton’s, 146 

spring oscillations, equation of motion, 26 

square root: calculation of, 207; of Lorentz vector, 731 

squashed sphere, length of equator, 80e 

stacked entities, 56 

standard candles, 359 

standard model of particle physics, 683 

standard notation, of coordinates, 25 

standard relativistic wave equation, 565 

Stark, Johannes, on Einstein, 216n 

stars: collapse into black holes, 455-456; first, 519; 
made of nothing, 456; pulsating, 304; relativistic 
interiors, 451-457; Riemann curvature tensor 
around, 362; Schwarzschild radius of sun, 409; 
stellar nucleosynthesis, 518-519 

static coordinates, 634; definition of, 652;
time-independence of metric, 636

static fields, classical theory, 119 

static isotropic spacetime, motion in, 306-307

static solutions, for coupled Einstein and Maxwell 
equations, 482-483 

static spacetime, 303-304; spherically symmetric, 61;
translation invariant physics in, 304f

static universe, Einstein’s, 509-510, 514

stationary limit surface, 461-463; angular velocity
inside, 471; Kerr black hole, 462f; outer, 469

stationary phase approximation, 770 

“stationed” observer, around black hole, 412 




stellar nucleosynthesis, and anthropic principle, 758 

stereographic projection: for anti de Sitter spacetime,
661; for de Sitter spacetime, 641; of sphere,
80-81e, 81f

straight line: appearance of, curved coordinates, 
130-131; distance of, in Minkowskian spacetime, 
175; form dependence of coordinate systems, 
127; geodesic problem, solutions of, 124; most 
complicated description of, 125; and parallel 
transport, 545; as shortest path between two 
points, 4; in twistor space, 742; between two 
points, 66, 90 

stress energy tensor, 386e; in outgoing brane wave
model, 704-705. See also energy momentum
tensor

string: action of, 146; boundary conditions for energy 
of, 115; elastic, hanging under force of gravity, 113; 
hanging, and variational calculus, 113-123; with 
nonuniform force distribution, 117; relativistic, 
action for, 210n 

string action, invariance of, 147, 216e 

string theory: and anthropic principle, 757;
Bekenstein-Hawking entropy and, 444; current of,
235; dilaton field in, 680; in early universe, 518;
and extremal black holes, 467; and generalized
uncertainty principle, 769; as higher dimensional
theory, 695; and Kaluza-Klein / Yang-Mills
theories, 682-683; large extra dimensions in, 696;
minimal, 147; sheets created from strings, 216f

string vibrations, speed of propagation of, 147 

strong energy condition, 557; and gravity attraction, 
562n 

strong force, generated by pions, 205 

strong interaction, 526; understanding of, 785 

structural equations, Cartan’s, 684 

structure formation, in early universe, 520, 522-523 

subextremal black holes: charged, 478-479; 
Reissner-Nordström, 483

subgroups, restriction of groups to, 57 

subscript, index notation, 32 

subtraction, of vectors, in Euclidean space, 101, 101f 

summation convention, 46, 184, 316; and general 
coordinate transformations, 71; in general 
relativity, 314; and Greek symbol notation, 63- 
64; Lorentz transformation of, 186; Minkowski 
metric, 182; and tensors, 52; and upper and lower 
indices, 64 

summation variables, dummy, 184n 

sums, notation of, Kronecker delta, 45 

sun: ratio of Schwarzschild radius to actual radius, 
367; Schwarzschild radius, 266, 409 

superb theorems, Newton’s, 33 

superconductivity, high temperature, 789n 

superrenormalizable interactions, 711-712 

superscript, index notation, 32 

supersymmetry: Bekenstein-Hawking entropy and, 
444; Yang-Mills theory, 621 

supertwistors, 739n 

suppressed angular coordinates, 422, 426 

surface curvature: compared to curved line, 89n; 
determination of, Gauss’s strategy, 104-105 

surface parametrization, 98 

surface vectors: basis for, in Euclidean space, 98; 
normal, 184; parallel transport of, 543 

surfaces: in 3-dimensional Euclidean space, 98-
109; generated of light rays, 185; gravity at,
473; “inside” and “outside” of, 85; metric on, 
in Euclidean space, 99; normal to, at certain 
point, 99f; “one way” in spacetime metrics, 185; 
punctured, 726; spacelike, 184; stationary limit, 
461-463, 469; tangent plane of, in Euclidean 
space, 98-99; trapped, 484, 789n; triangulation of, 
726 

Sylvester, James Joseph, 210; law of inertia, 193e 

symbolic manipulation software, computation of 
curvature tensor, 607 

symmetric mass distribution. See spherically 
symmetric mass distribution 

symmetric spaces, maximally, 585-593, 588; 
curvature tensor in, 589; negatively curved, 
610 


symmetric spacetimes, spherically, 611 

symmetric tensors, character of, 55 

symmetry: of angular momentum, 150; approach 
to fluid dynamics, 164; and conservation, 150- 
155; and curvature tensor, 561; cyclic, of Riemann 
curvature tensor, 351e; deduction of physics 
from, 254; dot notation, 129; and equivalence 
principle, 317-318; Fermi normal coordinates, 
561; and Fermi normal coordinates, 561; gauge, 
in higher dimensional theories, 682; and gauge 
invariance, 249; hidden, of nature, 210; imposed 
on gravity, 254; and invariance, 242-243; local 
gauge, in higher dimensional theories, 682; 
Lorentz, restrictions on electromagnetism, 
339; matter-antimatter, violation of, 528, 683; 
maximal, 592, 625, 626, 650; physical, definition 
of, 47; as property of tensors, 61; restrictions on 
Newtonian gravity, 339; of Riemann curvature 
tensor, 343, 561; in spatial indices, 609; of spheres 
in spacetime, 585; spherical, 298, 304-307, 310- 
311, 373, 409; supersymmetry, 444, 621. See also 
antisymmetry; rotations 

symmetry breaking, spontaneous, 593, 784 

symmetry group, Euclidean group as, 755 

symmetry relations, investigations of, in locally flat 
coordinate system, 343-344 

system of units, natural, 10-12 

Szekeres, George. See Kruskal-Szekeres entries 


’t Hooft, Gerard: bound on entropy of black 
holes, 442-443; naturalness doctrine, 750; and 
Yang-Mills field, 789n 

tangent plane: and curved surface of sphere, 83-84, 
83f; and normal to surface, 99f; rotating around 
normal vector, 100; of surface in Euclidean space, 
98-99 

tangent vectors: of curves, 96, 327; to geodesic, 
555; spacetime surfaces, 185; to straight lines, 
130 

tautochrone problem, Lagrange, 144 

Taylor, Joseph H., detection of binary pulsar, 563 

Taylor coefficients, and Riemann curvature, 91 

Tegmark, Max, inflationary cosmology, 536 

teleological discussions, in physics, 136 

temperature: ambient, of universe, 504; concept of, 
15; of cosmic microwave background, 515, 521- 
522; Hawking, 436, 441; inverse, 445; mystery of, 
15; for nonrelativistic gas, 231; of photon gases, 
495 

temporal boundary, and Poincaré half plane, 632 

temporal coordinate: in boundary theory, 660; 
dependence of spatial coordinate, 652 

temporal ordering: in antimatter creation, 206; in 
different frames, 204 

tennis ball trajectories, in space and spacetime, 33 

tensor, notation, Greek symbols in, 63 

tensor decomposition, 236e 

tensor density, definition of, 75n 
tensor fields, 243; electromagnetic, 244; gravity, 257;
introduction of, 53-54 
tensor notation: gravity potential, 57-59; Greek 
symbols in, 63; and Laplace’s equation, 58; 
Newtonian orbits, 60; particle motion, 57-59 
tensors: antisymmetric and symmetric character 
of, 55; construction of, 313; contraction, 316; 
covariant derivative as, 322; covariant derivative of, 
324; covariant divergence of, 332; definition of, 52; 
differentiation, 318; fear of, 52-53; form invariant, 
592-593; in general relativity, 312-319; and indices 
(upper and lower), 74; invariant, definition of, 59- 
60; Lie derivative, 328, 331; Lorentz, 188, 243; 
under Lorentz transformation, 193e; in Newtonian 
mechanics, 57-59; of polarization, gravitational 
waves, 565; and representation theory, 54; Ricci 
(see Ricci tensor); of slowly rotating bodies, 570; 
stress energy (see energy momentum tensor); 
symmetry properties of, 61, 343; trace of, 55; 
transformation of, 132; and vectors, interplay of, 
53-54 
tentacles, consisting of spacelike geodesics, 558
terrestrial and celestial mechanics, Newton’s
unification of, 28
test, “1-2,” 326
test particle, 302; PPN approximation, 309
tetrahedra: glued together, 725f; topology of, 725
Theorema Egregium, 90-91
theorems. See specific theorems
theoretical physics: and cosmological constant
paradox, 753; Einstein mode of, 778; fundamentals 
of, 783; “golden” guiding principle in, 338; impact 
of Einstein gravity, 777; unified perspective on, 
170. See also physics; quantum physics 
theories. See specific theories 
thermal radiation, from de Sitter horizon, 637 
thermocouples, Einstein’s ether detection, 
experimental set-up, 163 
thermodynamics: first and second law of, for black 
holes, 472-473; first law of, and pressure of 
universe, 360n; of spacetime, 448-449 
Thomson, J. J., praise for solar eclipse expeditions, 
369 
Thomson, Benjamin (Count Rumford), energy 
conservation, 387n 
Thoreau, Henry David, deeds for old and young 
people, 788 
thought experiments. See gedanken experiments 
tidal forces, 554; and finite sized objects, 716-717; 
gravitational waves, 567, 567f; introduction of, 59 
tilting light cones, at Schwarzschild radius, 420-421, 
421f 
time: in 4-dimensional matrix, 210; connection with 
gravity, 579; cosmic, 295, 530, 632; cosmological, 
in outgoing brane wave model, 706; cosmological 
problem of not enough, 521-522, 531; creation of, 
787; different rates of, in special relativity, 196; 
and gravity, 257-258; imaginary, in derivation of 
Hawking temperature, 445-446; lines of constant, 
637; in Minkowskian sphere, 631; mystery of, 
787; in Newtonian universe, 7; psychological, 
175n; and space, unifying, 174-175; specific, in 
integrals, 228-229; transit time, minimization, 
139; translation invariance in, 303-304; units for, 
10; unwound, 653 

time coordinates: multiple, 666n; notation, 25; two, 
652 

time delta function, 229 

time dependence: disappearing in static coordinates, 
635; Lagrangian without explicit, 153; of metric, 
455; in physics, 137-138; of spherically symmetric 
mass distributions, 373-374; of spherically 
symmetric spacetime, 311 

time dilation, 197; gravitational, 258-259, 284, 412, 
304; lifetime of particles, 198 

time evolution, of universe, 511f 

time evolution equations, importance in Newtonian 
mechanics, 400-401 

time reversal, strong gravitational sources, 574 

time reversal invariance, 416-417; accelerated 
expansion, 500 

time translation, in Penrose diagram, 620 

timelike curves, closed, 484; violating physics, 653 

timelike distances, 175 

timelike geodesics, 645; behavior of, 554-555; 
congruence of, 555; dense collection of, 555 

timelike infinities, 428, 428f 

timelike Killing vectors, 631, 637 

timelike physical singularity, 479 

Tinseau, D’Amondans Charles de, introduction of 
osculating plane, 97 

Tolman-Oppenheimer-Volkoff equation, 453, 457 

top ten worst physics terms, 767 

topological action, 720-721 

topological cylinder, anti de Sitter spacetime, 654 

topological field theory, 719-728 

topological invariants, 725-727 

topological quantization, 723 

topological terms, in gauge theories, 720-721 

topology. See differential forms 

torsion, of curves, 97 

torsion pendulum, and non-quantized gravity, 771 

torus, systems on, 723n 

total action, Newtonian world, 145 

total energy: conservation of, 35; Hamiltonian, 144 

total energy momentum tensor, disappearance of, 
394 

total momentum, conservation of, 37 

totally antisymmetric symbol, definition of, 50 

toy model, of spherical cluster of noninteracting 
particles, 376n 

trace: of matrix, and intrinsic curvature, 84; of tensor, 
55 

trans-Planckian cosmology, 518 

transextremal charged black holes, 478 


transformation invariance: of action principle, 147; 
of Poincaré coordinates, 657 

transformation matrix, 312; linearity, 313 

transformations, 80-81e; compared to variations, 
389; conformal, 614, 616; coordinate, 62, 68-70, 
564; Galilean, 18-20; gauge, as 5-dimensional 
coordinate transformation, 673; importance of, 
in theoretical physics, 75; infinitesimal, 187, 615; 
in Kaluza-Klein theory, 672; as pervasive theme
of theoretical physics, 68; under SO(3), 57f; and
vectors, 42 

transit time, minimization of, 139 

translation, generators of, 644 

translation invariance, 242; of physics, in static 
spacetime, 304f; in time, 303-304 

translation operator, introduction of, 340 

transport, Lie, 328 

transpose: of matrix, 45; of vector or matrix, 
definition of, 39 

transverse-traceless (TT) gauge, 565 

trapped surface, 484; presence of, 789n 

triangulation, of a surface, 726 

trihedron, moving, of smooth curve, 97f 

trunk, grown from spherical blob, 726 

Tsai, Ming-liang, What Time Is It over There? 514 

TT (transverse-traceless) gauge, 565 

tunneling, quantum, and Hawking radiation, 449 

Twain, Mark, on truth of knowledge, 410n 

twin paradox, 189, 194e 

twistor space: analogs of Euclidean space objects in, 
742; geometry of, 741-742; point-to-line mapped 
to spacetime, 742; points in, 741f 

twistors: ambitwistor representation, 736; 
complexification of variables, 732; covered Lorentz 
group, 729-730; and Einstein-Hilbert action, 739; 
freedom to rescale, 733; geometric essence of, 
739-740; and interaction among gravitons, 738-
739; introduction to, 730-745; Lorentz invariance,
734; motivation for studying, from quantum field 
theory, 731; polarization and helicity, 734; and 
power of helicity spinors, 735; and Roger Penrose, 
730-731; and SL(4, R) group, 737; and spacetime, 
739-740 

“Tycho Brahe day,” 369n 


ultimate theory, dream of, 789n 

ultrarelativistic particles. See massless particles 

ultraviolet catastrophe, 781; Planck and, 789n 

ultraviolet completion, of quantum gravity, 765 

ultraviolet regime, linkage to infrared regime, 752 

umveg test, 1n

uncertainty principle, 206n; antimatter creation, 
205; generalized, 769; and Kaluza-Klein theory,
674; and minimum length, 763; quantum field 
theory, 437; and quantum gravity, 762; and the 
three natural units, 11-12; and zero point energy, 
745-746 

unification: fundamental interactions (see grand
unified theory, string theory); of gravity and other
interactions, 767-768, 780; relativistic, 247; weak 
interaction and electromagnetic interaction, 765 

unified language, for different physical phenomena, 
186 

unified notation, of Lorentz transformation, 
186 

unimodular gravity, and cosmological constant 
paradox, 755-756 

unit circle, length element on, 80e 

unit determinants, 40 

unit matrix, definition of, 39 

unit spheres, metric on, 80e 

unit tangent vector, of a curve, 96 

unitarization, and ultraviolet completion, 765 

units: change of using, 16n; of distance, 168; Hubble, 
293; for length and time, 10; natural system of, 
10-12; royal and “revolutionary,” 163n. See also 
Planck units 

universal clock: in Newtonian physics, 25; set-up of, 
172 

universality of gravity, 258, 269-270; curved 
spacetime, 275-276 

universe: 2-dimensional map of, 506-507; acausality 
of, 754; acceleration or deceleration of expansion, 
506-507; action of, 346, 356; age of, 512-513; 
ambient temperature as cosmic clock, 504; critical 
density, 497-498; curvature of, 490-491, 526, 748; 
dominated by (nonrelativistic) matter, 495-496, 
514; dominated by radiation, 495-496; dynamic, 
489-501; early (see early universe); energy density 
of, 504; entropy of, 527; equation of motion for, 
357; equation of state of, 359; expanding (see 
expanding universe); fate of, 507-509; filled with 
constant energy density, 356; filled with perfect 
fluids, 492-493; filtered through human mind, 
779; foamlike structure of, 754, 758n; and gravity, 
778; hidden acausality of, 783; history of, 496, 
502, 503f, 515-529; homogeneity and isotropy 
problem, 531; Hubble radius of, and photon mean 
free path, 517; inflationary, 534-535; intrinsic 
curvature, 6; length scale of, characteristic, 788n; 
mass of, 747-748; obesity index of, 13; open, 629; 
open or flat, troubling Wheeler, 779; as perfect 
fluid, 231; with positive cosmological constant, 
633; scale factor of (see scale factor of universe); 
Schwarzschild radius of, 514; time evolution of, 
511f 

universes: closed/open/flat, 296-297, 491, 493-494, 
497-498; as curved spacetime, 288-300; different 
from de Sitter spacetime, 633; with different laws 
of physics, 757; Friedmann-Robertson-Walker, 
296, 491, 704; mathematical, 634; static, 509-510, 
514 

unprimed coordinates, 18, 38; metric with, 71-73 

Unreasonable Effectiveness of Mathematics in Physics, 
The (Wigner), 446 

Unruh effect, 446-447 
upper indices, 314-316; and introduction of
lower indices, 64; transformations in change of 
coordinates, 71-73 

ur-vector, 312; definition of, 43; with lower index, 
318; spacetime metrics, 181 


vacuum: as boiling sea of quantum fluctuations, 
745-746; restless, 436-438 

vacuum Einstein equation, solution of, 647e 

vacuum energy, 746; driving inflation, 751; 
explanation of, 752-753; in outgoing brane wave 
model, 706; proofs of, 748 

vacuum energy density, upper bound to, 749 

vacuum state, 447 

variables: dynamical, 249; “free,” in variational 
calculus, 116; in functional variations, 121-122 

variation: of action: for electromagnetism, 244, 
250-251, 380; of basis vectors, 100; compared to 
transformation, 389 

variational calculus, 155; of brachistochrone 
problem, 120; compromises in finding extreme 
values, 115; functional, 114-115; and hanging 
string, 113-123; integration by parts, 116; of 
several unknown functions, 123; solution of 
geodesic problem, 125 

variational principle: equation of motion from, 137; 
for gravity, Einstein and Grossmann, 396 

vector fields: constant, covariant derivative of, 
331; differentiation of, 100-101; index-free 
representation of, 319; introduction of, 46; 
movement through, 544; studied by observers, 
47f; visualized as fluids, 327f 

vector potential, Lorentz, 243, 248 


vector subtraction, in Euclidean space, 101, 101f 

vectors: and arrays, 51n; basic or ur-, definition of, 
43; column, notation of, 45; and construction of 
tensors, 313; contravariant, 183; covariant, 183, 
340; definition of, 39; definition of, representation 
theory, 54; differentiation, 318; displacement of, 
in curved rectangle, 341f; and indices (upper 
and lower), 73-74; lightlike, 731; Mother of All, 
312-313; notation for, 182; parallel transport of, 
101-102, 545f; projected on tangent plane, 102; 
solution of isometric condition, 586; of spacetime 
metrics, 181; on surface, parallel transport 
of, 543; and tensors, interplay of, 53-54; and 
transformations, 42; transporting via alternative 
routes, 548 

velocities: addition of, 160-161, 163, 171, 173e; 
angular: around rotating black holes, 460, 471; 
completion and promotion of, 218-219; Fermi- 
Walker transported, 193e; Galilean law for addition 
of, 19; low limit of Lorentz transformation, 169; 
measurements in trains, 166; of objects in cars, 
162-163; observed in Galileo transformation, 161; 
rotating bodies, 570 

velocity vector: of curves, 327; along geodesic, 330 

vertices, in topology, 725-727 
Vicious and Nasty, dueling thinkers experiment, 7-9,
8f 

vielbein: 1-form, 600; and differential forms, 594-
606; Kaluza-Klein metric, 690-691; as square roots
of metric, 596

VIRGO, gravitational wave detector, 577n 

virial theorem, relativistic generalization, 255 

visibility problem: in Kaluza-Klein theory, 673-674;
with large extra dimensions, 696-697 

Voigt, W., Lorentz transformation, 169n 

Volovik, G., solution of cosmological constant 
paradox, 759n 

volume element, generalized, determination for any 
curved space, 75-76

Vulcan (predicted planet), 368 


Walker, Arthur G. See Friedmann-Robertson-Walker 
universes 

“wanting the cake and eating it too” syndrome, 751 

war, ancient art of, 103-104 

warp function, 701 

warped polar coordinates, 613e 

wave equation: derivation of speed of sound, 235; 
standard relativistic, 565 

wave function, phase angle of, in Kaluza-Klein
theory, 678 

wave guide, 694 

wave vectors, by different observers, 185 

wavelength, de Broglie, particles at Schwarzschild 
radius, 442 

waves: bulk, to brane, 703f; gravitational (see 
gravitational waves); understanding of, 783-784 

weak energy condition, 57 

weak field, 564 

weak field action, determination of, without 
Riemannian geometry, 572 

weak field approximation, for gravitational sources, 
569-570 

weak interaction, 526; CP violation in, 528, 683; 
Fermi’s theory of, 765; of massive particles, 522; 
ultraviolet completion of, 765 

Weinberg, Steven: primeval nucleosynthesis, 528; 
quantum gravity governed by attractive ultraviolet 
fixed point, 773n; upper bound for cosmological 
constant, 757; very weak version of anthropic 
principle, 752; from weak field to Einstein gravity, 
580 

Weinberg-Witten theorem, 787

Weingarten, Julius, equation of, 106 

wet dog, effect of inertia, 276 
Collection of Formulas and Conventions


The following is a loosely organized list of formulas used in this text. 


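$$ \phi'(x') = \phi(x) \tag{1} $$
$$ dx'^\mu = \frac{\partial x'^\mu}{\partial x^\nu}\,dx^\nu = S^\mu{}_\nu(x)\,dx^\nu \tag{2} $$
$$ \partial'_\mu = \frac{\partial x^\nu}{\partial x'^\mu}\,\partial_\nu = (S^{-1})^\nu{}_\mu\,\partial_\nu \tag{3} $$
$$ S^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} \tag{4} $$
$$ (S^{-1})^\rho{}_\mu = \frac{\partial x^\rho}{\partial x'^\mu} \tag{5} $$
$$ W'^\mu(x') = S^\mu{}_\nu(x)\,W^\nu(x) = \frac{\partial x'^\mu}{\partial x^\nu}\,W^\nu(x) \tag{6} $$
$$ W'_\mu(x') = W_\nu(x)\,(S^{-1})^\nu{}_\mu(x) = W_\nu(x)\,\frac{\partial x^\nu}{\partial x'^\mu} \tag{7} $$
$$ ds^2 = g'_{\rho\sigma}(x')\,dx'^\rho dx'^\sigma = g_{\mu\nu}(x)\,dx^\mu dx^\nu = g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma}\,dx'^\rho dx'^\sigma \tag{8} $$
$$ g'_{\rho\sigma}(x') = g_{\mu\nu}(x)\,\frac{\partial x^\mu}{\partial x'^\rho}\frac{\partial x^\nu}{\partial x'^\sigma} = g_{\mu\nu}(x)\,(S^{-1})^\mu{}_\rho\,(S^{-1})^\nu{}_\sigma \tag{9} $$
$$ g'^{\mu\nu}(x') = S^\mu{}_\rho\,S^\nu{}_\sigma\,g^{\rho\sigma}(x) \tag{10} $$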


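Infinitesimal transformation $x'^\mu = x^\mu + \varepsilon^\mu(x)$


$$ g'_{\mu\nu}(x') = g_{\mu\nu}(x) - g_{\rho\nu}\,\partial_\mu\varepsilon^\rho - g_{\mu\rho}\,\partial_\nu\varepsilon^\rho \tag{11} $$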
Example of tensor transformation


PO SST! QS) SD 


Geodesic 
$$ \frac{d^2X^\rho}{dl^2} + \Gamma^\rho_{\mu\nu}\big(X(l)\big)\,\frac{dX^\mu}{dl}\frac{dX^\nu}{dl} = 0 $$
Christoffel symbol 


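$$ \Gamma^\rho_{\mu\nu} = \tfrac{1}{2}\,g^{\rho\lambda}\left(\partial_\mu g_{\nu\lambda} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu}\right) $$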


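$$ \Gamma_{\lambda\mu\nu} \equiv g_{\lambda\rho}\Gamma^\rho_{\mu\nu} = \tfrac{1}{2}\left(\partial_\mu g_{\nu\lambda} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu}\right) $$


$$ \Gamma^\mu_{\mu\nu} = \frac{1}{\sqrt{g}}\,\partial_\nu\sqrt{g} = \partial_\nu\ln\sqrt{g} $$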
Christoffel symbols for polar coordinates 
$$ \Gamma^r_{\theta\theta} = -r, \qquad \Gamma^\theta_{r\theta} = \frac{1}{r} $$
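
As a quick numerical cross-check of the Christoffel formula $\Gamma^\rho_{\mu\nu} = \tfrac{1}{2} g^{\rho\lambda}(\partial_\mu g_{\nu\lambda} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu})$, the following Python sketch evaluates it by central finite differences for the plane in polar coordinates, $ds^2 = dr^2 + r^2 d\theta^2$. The function names (`polar_metric`, `christoffel`) and the step size `h` are illustrative choices, not taken from the text. The values one expects are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = 1/r$.

```python
def polar_metric(x):
    """Metric g_{mu nu} for ds^2 = dr^2 + r^2 dtheta^2, with x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Return gamma[rho][mu][nu] for a 2-dimensional metric at the point x."""
    n = len(x)
    # dg[lam][mu][nu] approximates d g_{mu nu} / d x^lam by central differences.
    dg = []
    for lam in range(n):
        xp, xm = list(x), list(x)
        xp[lam] += h
        xm[lam] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[m][v] - gm[m][v]) / (2.0 * h) for v in range(n)]
                   for m in range(n)])
    ginv = inverse_2x2(metric(x))
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for rho in range(n):
        for mu in range(n):
            for nu in range(n):
                gamma[rho][mu][nu] = 0.5 * sum(
                    ginv[rho][lam]
                    * (dg[mu][nu][lam] + dg[nu][mu][lam] - dg[lam][mu][nu])
                    for lam in range(n))
    return gamma


if __name__ == "__main__":
    gamma = christoffel(polar_metric, [2.0, 0.3])
    print("Gamma^r_{theta theta} =", gamma[0][1][1])   # compare with -r
    print("Gamma^theta_{r theta} =", gamma[1][0][1])   # compare with 1/r
```

Passing a different 2-dimensional metric function, for example one returning `[[1, 0], [0, sin(theta)**2]]` for the unit sphere, reuses the same routine unchanged.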


Christoffel symbols for the sphere 


$$ \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta, \qquad \Gamma^\varphi_{\theta\varphi} = \frac{\cos\theta}{\sin\theta} $$


P= SS) ST SES aS, 


kK WO 


Covariant derivatives 
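$$ D_\lambda W^\mu = \partial_\lambda W^\mu + \Gamma^\mu_{\lambda\nu}W^\nu $$
$$ D_\lambda W_\mu = \partial_\lambda W_\mu - \Gamma^\nu_{\lambda\mu}W_\nu $$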


Covariant divergence of a tensor 


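$$ D_\mu T^{\mu\nu} = \partial_\mu T^{\mu\nu} + \Gamma^\mu_{\mu\lambda}T^{\lambda\nu} + \Gamma^\nu_{\mu\lambda}T^{\mu\lambda} = \frac{1}{\sqrt{g}}\,\partial_\mu\left(\sqrt{g}\,T^{\mu\nu}\right) + \Gamma^\nu_{\mu\lambda}T^{\mu\lambda} $$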


Sphere $S^2$

$$ ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2 = \frac{dr^2}{1 - r^2} + r^2 d\varphi^2 $$


Stereographic projection for $S^2$


$$ ds^2 = \frac{1}{\left(1 + \frac{\rho^2}{4}\right)^2}\left(d\rho^2 + \rho^2 d\varphi^2\right) $$
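
The stereographic form of the metric can be checked numerically against the round metric $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$. The sketch below assumes the map $\rho = 2\tan(\theta/2)$, which is one standard choice of projection and is an assumption here rather than something stated in this list; the function names are likewise illustrative. It compares the two line elements for a small displacement $(d\theta, d\varphi)$.

```python
import math


def rho_of_theta(theta):
    """Assumed stereographic map from polar angle to planar radius."""
    return 2.0 * math.tan(theta / 2.0)


def sphere_line_element(theta, dtheta, dphi):
    """ds^2 = dtheta^2 + sin^2(theta) dphi^2 on the unit sphere."""
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2


def stereographic_line_element(rho, drho, dphi):
    """ds^2 = (drho^2 + rho^2 dphi^2) / (1 + rho^2/4)^2."""
    return (drho ** 2 + rho ** 2 * dphi ** 2) / (1.0 + rho ** 2 / 4.0) ** 2


def compare(theta, dtheta=1e-6, dphi=1e-6):
    """Return both line elements for the same small displacement."""
    rho = rho_of_theta(theta)
    drho = rho_of_theta(theta + dtheta) - rho
    return (sphere_line_element(theta, dtheta, dphi),
            stereographic_line_element(rho, drho, dphi))


if __name__ == "__main__":
    for theta in (0.2, 1.0, 2.5):
        s, p = compare(theta)
        print(theta, s, p)
```

The two numbers agree up to terms of higher order in the displacement, which is what one expects from a first-order finite difference.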


Iterative relation for sphere 


$$ ds_d^2 = d\theta^2 + \sin^2\theta\,ds_{d-1}^2 $$


(12) 


(13) 


(14) 
(15) 


(16) 


(17) 


(18) 


(19) 


(20) 


(21) 


(23) 


(26) 
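Locally flat coordinate system


$$ g_{\tau\mu}(x) = \eta_{\tau\mu} + B_{\tau\mu,\lambda\sigma}\,x^\lambda x^\sigma + \cdots, \qquad B_{\tau\mu,\lambda\sigma} = \tfrac{1}{2}\,\partial_\lambda\partial_\sigma g_{\tau\mu} \tag{27} $$


$$ R_{\tau\rho\mu\nu} = B_{\tau\nu,\mu\rho} - B_{\rho\nu,\mu\tau} - B_{\tau\mu,\nu\rho} + B_{\rho\mu,\nu\tau} \tag{28} $$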


Riemann curvature tensor 


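$$ R^\lambda{}_{\rho\mu\nu} = \left(\partial_\mu\Gamma^\lambda_{\nu\rho} + \Gamma^\lambda_{\mu\kappa}\Gamma^\kappa_{\nu\rho}\right) - \left(\partial_\nu\Gamma^\lambda_{\mu\rho} + \Gamma^\lambda_{\nu\kappa}\Gamma^\kappa_{\mu\rho}\right) \tag{29} $$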


Ricci tensor 


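$$ R_{\mu\nu} = R^\sigma{}_{\mu\sigma\nu} = \left(\partial_\sigma\Gamma^\sigma_{\mu\nu} + \Gamma^\sigma_{\sigma\kappa}\Gamma^\kappa_{\mu\nu}\right) - \left(\partial_\nu\Gamma^\sigma_{\sigma\mu} + \Gamma^\sigma_{\nu\kappa}\Gamma^\kappa_{\sigma\mu}\right) \tag{30} $$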
Bianchi identity 

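$$ D_\lambda R_{\tau\rho\mu\nu} + D_\nu R_{\tau\rho\lambda\mu} + D_\mu R_{\tau\rho\nu\lambda} = 0 \tag{31} $$

$$ D^\mu\left(R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R\right) = D^\mu E_{\mu\nu} = 0 \tag{32} $$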


Einstein’s field equation and action 


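$$ S_{\rm EH} = \frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,R \tag{33} $$
$$ \delta g^{\rho\sigma} = -g^{\rho\mu}\,\delta g_{\mu\nu}\,g^{\nu\sigma} \tag{34} $$
$$ \delta\sqrt{-g} = \tfrac{1}{2}\sqrt{-g}\,g^{\mu\nu}\,\delta g_{\mu\nu} \tag{35} $$
$$ R^{\mu\nu} - \tfrac{1}{2}g^{\mu\nu}R = +8\pi G\,T^{\mu\nu} \tag{36} $$
$$ R^{\mu\nu} = +8\pi G\left(T^{\mu\nu} - \tfrac{1}{2}g^{\mu\nu}T\right) \tag{37} $$
$$ \delta S_{\rm matter} = +\tfrac{1}{2}\int d^4x\,\sqrt{-g}\,T^{\mu\nu}\,\delta g_{\mu\nu} \tag{38} $$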


Newton’s field equation 


$$ \nabla^2\Phi = 4\pi G\rho \tag{39} $$


$$ g_{00} = -(1 + 2\Phi), \qquad \Phi = -\frac{GM}{r} \tag{40} $$


Static symmetric spacetime 


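$$ ds^2 = -A(r)\,dt^2 + B(r)\,dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2 \tag{41} $$
$$ \Gamma^t_{tr} = \frac{A'}{2A}, \quad \Gamma^r_{tt} = \frac{A'}{2B}, \quad \Gamma^r_{rr} = \frac{B'}{2B}, \quad \Gamma^r_{\theta\theta} = -\frac{r}{B}, \quad \Gamma^r_{\varphi\varphi} = -\frac{r\sin^2\theta}{B}, $$
$$ \Gamma^\theta_{r\theta} = \Gamma^\varphi_{r\varphi} = \frac{1}{r}, \quad \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta, \quad \Gamma^\varphi_{\varphi\theta} = \cot\theta \tag{42} $$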
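$$ R_{tt} = \frac{A''}{2B} + \frac{A'}{rB} - \frac{A'}{4B}\left(\frac{A'}{A} + \frac{B'}{B}\right) $$

$$ R_{rr} = -\frac{A''}{2A} + \frac{B'}{rB} + \frac{A'}{4A}\left(\frac{A'}{A} + \frac{B'}{B}\right) $$

$$ R_{\theta\theta} = 1 - \frac{1}{B} - \frac{r}{2B}\left(\frac{A'}{A} - \frac{B'}{B}\right) $$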
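Schwarzschild solution ($r_S = 2GM$)


$$ ds^2 = -\left(1 - \frac{r_S}{r}\right)dt^2 + \frac{dr^2}{1 - \frac{r_S}{r}} + r^2 d\theta^2 + r^2\sin^2\theta\,d\varphi^2 $$

$$ \phantom{ds^2} = -\left(\frac{r - r_S}{r}\right)\left(dt + \frac{r}{r - r_S}\,dr\right)\left(dt - \frac{r}{r - r_S}\,dr\right) + r^2 d\Omega^2 $$


$$ \Gamma^t_{tr} = \frac{r_S}{2r(r - r_S)}, \quad \Gamma^r_{tt} = \frac{r_S}{2r^3}(r - r_S), \quad \Gamma^r_{rr} = -\frac{r_S}{2r(r - r_S)}, $$
$$ \Gamma^r_{\theta\theta} = -(r - r_S), \quad \Gamma^r_{\varphi\varphi} = -(r - r_S)\sin^2\theta, $$
$$ \Gamma^\theta_{r\theta} = \Gamma^\varphi_{r\varphi} = \frac{1}{r}, \quad \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta, \quad \Gamma^\varphi_{\theta\varphi} = \cot\theta $$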
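
The Schwarzschild Christoffel symbols are a special case of those for a general static metric $ds^2 = -A(r)dt^2 + B(r)dr^2 + r^2 d\Omega^2$, namely $\Gamma^t_{tr} = A'/2A$, $\Gamma^r_{tt} = A'/2B$, $\Gamma^r_{rr} = B'/2B$, $\Gamma^r_{\theta\theta} = -r/B$, $\Gamma^r_{\varphi\varphi} = -r\sin^2\theta/B$, with $A = 1 - r_S/r$ and $B = 1/A$. A minimal Python sketch of that substitution follows; the dictionary keys and function names are made up for illustration.

```python
import math


def static_christoffels(A, B, dA, dB, r, theta):
    """Christoffel symbols of ds^2 = -A dt^2 + B dr^2 + r^2 dOmega^2.

    A, B are the metric functions at radius r; dA, dB are their r-derivatives.
    """
    return {
        "t_tr": dA / (2.0 * A),
        "r_tt": dA / (2.0 * B),
        "r_rr": dB / (2.0 * B),
        "r_thth": -r / B,
        "r_phph": -r * math.sin(theta) ** 2 / B,
    }


def schwarzschild_christoffels(r, theta, rs):
    """Closed forms for A = 1 - rs/r and B = 1/A, valid for r > rs."""
    return {
        "t_tr": rs / (2.0 * r * (r - rs)),
        "r_tt": rs * (r - rs) / (2.0 * r ** 3),
        "r_rr": -rs / (2.0 * r * (r - rs)),
        "r_thth": -(r - rs),
        "r_phph": -(r - rs) * math.sin(theta) ** 2,
    }


def max_discrepancy(r=5.0, theta=0.7, rs=2.0):
    """Largest absolute difference between the general and closed forms."""
    A = 1.0 - rs / r
    B = 1.0 / A
    dA = rs / r ** 2          # d/dr (1 - rs/r)
    dB = -dA / A ** 2         # d/dr (1/A)
    general = static_christoffels(A, B, dA, dB, r, theta)
    closed = schwarzschild_christoffels(r, theta, rs)
    return max(abs(general[k] - closed[k]) for k in closed)


if __name__ == "__main__":
    print(max_discrepancy())
```

The discrepancy is at the level of floating-point rounding for any radius outside the Schwarzschild radius.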
alae Pop! 99 bp 


Kruskal-Szekeres coordinates 


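$$ ds^2 = -\frac{4r_S^3}{r}\,e^{-r/r_S}\left(dV^2 - dU^2\right) + r^2 d\Omega^2 $$


$$ V = \left(\frac{r}{r_S} - 1\right)^{1/2} e^{r/2r_S}\sinh\left(\frac{t}{2r_S}\right), \qquad U = \left(\frac{r}{r_S} - 1\right)^{1/2} e^{r/2r_S}\cosh\left(\frac{t}{2r_S}\right) $$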
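
For the exterior region $r > r_S$, the Kruskal-Szekeres coordinates $V = (r/r_S - 1)^{1/2} e^{r/2r_S}\sinh(t/2r_S)$ and $U = (r/r_S - 1)^{1/2} e^{r/2r_S}\cosh(t/2r_S)$ satisfy $U^2 - V^2 = (r/r_S - 1)e^{r/r_S}$ and $V/U = \tanh(t/2r_S)$. The short Python sketch below, with an illustrative function name, evaluates the pair so that both relations can be checked for any chosen $(t, r)$.

```python
import math


def kruskal(t, r, rs):
    """Return (V, U) for Schwarzschild (t, r) in the exterior region r > rs."""
    if r <= rs:
        raise ValueError("this sketch covers only the exterior region r > rs")
    prefactor = math.sqrt(r / rs - 1.0) * math.exp(r / (2.0 * rs))
    V = prefactor * math.sinh(t / (2.0 * rs))
    U = prefactor * math.cosh(t / (2.0 * rs))
    return V, U


if __name__ == "__main__":
    t, r, rs = 1.5, 3.0, 2.0
    V, U = kruskal(t, r, rs)
    print(U * U - V * V, (r / rs - 1.0) * math.exp(r / rs))
    print(V / U, math.tanh(t / (2.0 * rs)))
```

Curves of constant $r$ are therefore hyperbolas in the $(U, V)$ plane, and curves of constant $t$ are straight lines through the origin.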


Tolman-Oppenheimer-Volkoff equation of relativistic stellar structure 


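$$ \frac{dP}{dr} = -\frac{G M(r)\rho(r)}{r^2}\left(1 + \frac{P(r)}{\rho(r)}\right)\left(1 + \frac{4\pi r^3 P(r)}{M(r)}\right)\left(1 - \frac{2GM(r)}{r}\right)^{-1} $$

$$ \frac{dM(r)}{dr} = 4\pi r^2\rho(r) $$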
Kerr black hole 
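$$ ds^2 = -\left(1 - \frac{r_S r}{\rho^2}\right)dt^2 - \frac{2 r_S r a\sin^2\theta}{\rho^2}\,dt\,d\varphi + \frac{\rho^2}{\Delta}\,dr^2 + \rho^2 d\theta^2 + \left(r^2 + a^2 + \frac{r_S r a^2\sin^2\theta}{\rho^2}\right)\sin^2\theta\,d\varphi^2 $$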


(43) 


(45) 


(47) 


(48) 


(49) 


(52) 
where
$$ r_S = 2M, \quad a = \frac{J}{M} = \frac{2J}{r_S}, \quad \rho^2 = r^2 + a^2\cos^2\theta, \quad \Delta = r^2 + a^2 - r r_S \tag{53} $$


Reissner-Nordström black hole


$$ ds^2 = -\left(1 - \frac{2M}{r} + \frac{Q^2}{r^2}\right)dt^2 + \frac{dr^2}{1 - \frac{2M}{r} + \frac{Q^2}{r^2}} + r^2 d\Omega^2 \tag{54} $$
Perfect fluid 
$$ T^{\mu\nu} = (\rho + P)\,U^\mu U^\nu + P g^{\mu\nu} \tag{55} $$
Cosmology 
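$$ ds^2 = -dt^2 + a(t)^2\left(\frac{dr^2}{1 - kr^2} + r^2 d\Omega^2\right) \tag{56} $$
$$ \dot a^2 + k = \frac{8\pi G}{3}\,\rho\,a^2 \tag{57} $$
$$ \frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3P) \tag{58} $$
$$ \dot\rho + 3(\rho + P)\frac{\dot a}{a} = 0 \tag{59} $$
$$ \rho \propto \frac{1}{a^{3(1+w)}} \tag{60} $$
$$ H^2 = H_0^2\sum_n \frac{\Omega_{n,0}}{a^{3(1+w_n)}} = H_0^2\left(\frac{\Omega_{m,0}}{a^3} + \frac{\Omega_{r,0}}{a^4} + \Omega_{\Lambda,0} + \frac{\Omega_{k,0}}{a^2}\right) \tag{61} $$
$$ \frac{\ddot a/a}{H^2} = -\tfrac{1}{2}\sum_i (1 + 3w_i)\,\Omega_i = -\tfrac{1}{2}\left(2\Omega_r + \Omega_m - 2\Omega_\Lambda\right) \tag{62} $$
$$ \dot\Omega_i = +H\Omega_i\left(\sum_j (1 + 3w_j)\,\Omega_j - (1 + 3w_i)\right) \tag{63} $$
For $\Omega_r \simeq 0$,
$$ \dot\Omega_m = H\Omega_m\left(\Omega_m - 2\Omega_\Lambda - 1\right), \qquad \dot\Omega_\Lambda = H\Omega_\Lambda\left(\Omega_m - 2\Omega_\Lambda + 2\right) \tag{64} $$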
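
The pair of evolution equations $\dot\Omega_m = H\Omega_m(\Omega_m - 2\Omega_\Lambda - 1)$ and $\dot\Omega_\Lambda = H\Omega_\Lambda(\Omega_m - 2\Omega_\Lambda + 2)$ can be integrated in the variable $N = \ln a$, since $d/dN = H^{-1}d/dt$. The Python sketch below does this with a classical fourth-order Runge-Kutta step and compares against the closed form for a flat universe with matter and cosmological constant only, $\Omega_\Lambda(N) = \Omega_{\Lambda,0}/(\Omega_{\Lambda,0} + \Omega_{m,0}e^{-3N})$. The integrator, the function names, and the sample values 0.3 and 0.7 are illustrative assumptions, not taken from the text.

```python
import math


def d_omega_dN(omega_m, omega_l):
    """Right-hand sides divided by H, i.e. derivatives with respect to N = ln a."""
    return (omega_m * (omega_m - 2.0 * omega_l - 1.0),
            omega_l * (omega_m - 2.0 * omega_l + 2.0))


def evolve(omega_m0, omega_l0, n_end, steps=1000):
    """Integrate from N = 0 to N = n_end with fixed-step RK4."""
    h = n_end / steps
    m, l = omega_m0, omega_l0
    for _ in range(steps):
        k1 = d_omega_dN(m, l)
        k2 = d_omega_dN(m + 0.5 * h * k1[0], l + 0.5 * h * k1[1])
        k3 = d_omega_dN(m + 0.5 * h * k2[0], l + 0.5 * h * k2[1])
        k4 = d_omega_dN(m + h * k3[0], l + h * k3[1])
        m += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        l += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return m, l


def flat_omega_lambda(omega_m0, omega_l0, n):
    """Closed form when omega_m0 + omega_l0 = 1 (matter plus Lambda only)."""
    return omega_l0 / (omega_l0 + omega_m0 * math.exp(-3.0 * n))


if __name__ == "__main__":
    m, l = evolve(0.3, 0.7, 1.0)
    print(m, l, flat_omega_lambda(0.3, 0.7, 1.0))
```

Starting on the flat line $\Omega_m + \Omega_\Lambda = 1$, the numerical solution stays on it and flows toward $\Omega_\Lambda = 1$ as $N$ grows.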
Weak field 
$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad R_{\mu\nu} = -\tfrac{1}{2}\,\partial^2 h_{\mu\nu} \quad \text{(harmonic gauge)} \tag{65} $$
Killing condition 
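$$ \xi_{\sigma;\rho} + \xi_{\rho;\sigma} = 0 \tag{66} $$
$$ g_{\mu\sigma}\,\partial_\rho\xi^\mu + g_{\rho\nu}\,\partial_\sigma\xi^\nu + \xi^\lambda\partial_\lambda g_{\rho\sigma} = 0 \tag{67} $$
$$ \mathcal{L}_\xi g_{\mu\nu} = 0 \tag{68} $$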




Maximal symmetry 
$$ R_{\tau\rho\mu\nu} = K\left(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu}\right) $$
$$ R_{\mu\nu} = (D - 1)K\,g_{\mu\nu} $$
$$ R = D(D - 1)K $$
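
The step from the maximally symmetric Riemann tensor $R_{\tau\rho\mu\nu} = K(g_{\tau\mu}g_{\rho\nu} - g_{\tau\nu}g_{\rho\mu})$ to $R_{\mu\nu} = (D-1)Kg_{\mu\nu}$ and $R = D(D-1)K$ is a pair of index contractions. The Python sketch below carries them out explicitly at a single point, under the simplifying assumption that coordinates are chosen so the metric there is the identity matrix; the function names are illustrative.

```python
def delta(a, b):
    """Kronecker delta, standing in for the metric at the chosen point."""
    return 1.0 if a == b else 0.0


def maximally_symmetric_riemann(D, K):
    """R[t][r][m][n] = K (g_tm g_rn - g_tn g_rm) with g = identity."""
    return [[[[K * (delta(t, m) * delta(r, n) - delta(t, n) * delta(r, m))
               for n in range(D)]
              for m in range(D)]
             for r in range(D)]
            for t in range(D)]


def ricci(riemann, D):
    """Contract the first and third indices: R_{rn} = sum_t R_{t r t n}."""
    return [[sum(riemann[t][r][t][n] for t in range(D)) for n in range(D)]
            for r in range(D)]


def scalar_curvature(ricci_tensor, D):
    """Trace of the Ricci tensor (the inverse metric is the identity here)."""
    return sum(ricci_tensor[m][m] for m in range(D))


if __name__ == "__main__":
    D, K = 4, 0.5
    ric = ricci(maximally_symmetric_riemann(D, K), D)
    print(ric[0][0], (D - 1) * K)
    print(scalar_curvature(ric, D), D * (D - 1) * K)
```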
Differential forms and vielbein 
$$ g_{\mu\nu}(x) = \eta_{\alpha\beta}\,e^\alpha_\mu(x)\,e^\beta_\nu(x) $$
$$ e^\alpha = e^\alpha_\mu\,dx^\mu $$


Cartan’s first and second structural relations 


$$ de + \omega e = 0 $$

$$ R = d\omega + \omega^2 $$
Antisymmetry 

wo, = 4% — — yb — w, 

ee ee ee 


Conformally related metrics 
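$$ \tilde g_{\mu\nu}(x) = \Omega^2(x)\,g_{\mu\nu}(x) $$
$$ \tilde\Gamma^\lambda_{\mu\nu} = \Gamma^\lambda_{\mu\nu} + \frac{1}{\Omega}\left(\delta^\lambda_\mu\,\partial_\nu\Omega + \delta^\lambda_\nu\,\partial_\mu\Omega - g_{\mu\nu}\,g^{\lambda\kappa}\,\partial_\kappa\Omega\right) $$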


= Dj dy 
Re vg = Re vig — (51858) — 855; 3) +8083 — Bvi8!5y) a — 


+ (26;'62 59 — 2g, ghP8E + 2e5g 808), — 2505050 + 8ygre ds — 8ie8"8;,) 


A OV oven 


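$$ \tilde R_{\mu\nu} = R_{\mu\nu} - \left[(d - 2)\,\delta^\rho_\mu\delta^\sigma_\nu + g_{\mu\nu}g^{\rho\sigma}\right]\frac{D_\rho\partial_\sigma\Omega}{\Omega} + \left[2(d - 2)\,\delta^\rho_\mu\delta^\sigma_\nu - (d - 3)\,g_{\mu\nu}g^{\rho\sigma}\right]\frac{(\partial_\rho\Omega)(\partial_\sigma\Omega)}{\Omega^2} $$

$$ \tilde R = \frac{R}{\Omega^2} - 2(d - 1)\,g^{\rho\sigma}\frac{D_\rho\partial_\sigma\Omega}{\Omega^3} - (d - 1)(d - 4)\,g^{\rho\sigma}\frac{(\partial_\rho\Omega)(\partial_\sigma\Omega)}{\Omega^4} $$


with $D_\rho\partial_\sigma\Omega = D_\rho D_\sigma\Omega$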
Weyl tensor 
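$$ C_{\mu\nu\rho\sigma} = R_{\mu\nu\rho\sigma} + \frac{1}{d - 2}\left(g_{\mu\sigma}R_{\rho\nu} + g_{\nu\rho}R_{\sigma\mu} - g_{\mu\rho}R_{\sigma\nu} - g_{\nu\sigma}R_{\rho\mu}\right) + \frac{1}{(d - 1)(d - 2)}\left(g_{\mu\rho}g_{\sigma\nu} - g_{\mu\sigma}g_{\rho\nu}\right)R $$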


Lie derivative 
$$ \mathcal{L}_V W^\mu = [V, W]^\mu = V^\nu\partial_\nu W^\mu - W^\nu\partial_\nu V^\mu = V^\nu D_\nu W^\mu - W^\nu D_\nu V^\mu $$


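$$ \mathcal{L}_V W_{\mu\nu} = V^\lambda\partial_\lambda W_{\mu\nu} + W_{\lambda\nu}\,\partial_\mu V^\lambda + W_{\mu\lambda}\,\partial_\nu V^\lambda $$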


(82) (8,82) 


Q2 


($2) (8,82) 


Q2 


(72) 


(73) 


(76) 


(77) 


(81) 


(82) 


(83) 
Geodesic deviation


Deh p ax? dx? 4 
= Ss 84 
Dt? oP. dt dt (84) 
$$ -(X^0)^2 + \sum_{i=1}^{d-1}(X^i)^2 + (X^d)^2 = L^2, \qquad \text{de Sitter spacetime } dS^d \tag{85} $$

$$ (X^0)^2 - \sum_{i=1}^{d-1}(X^i)^2 + (X^d)^2 = L^2, \qquad \text{anti de Sitter spacetime } AdS^d \tag{86} $$


Conventions and pesky signs 


Physicists constantly trouble themselves with conventions and pesky ± signs. Of course
it is trivial, but somebody at some point had to decide that clocks go clockwise. Relativity 
is particularly notorious for the different signs used by different authors. Evidently, each 
convention has advantages and disadvantages, otherwise authors wouldn't continue to 
keep various conventions alive. So it is futile and useless to argue about the superiority of 
one convention over another. 

In this text, we use the space dominant Minkowski metric $\eta_{\mu\nu} = (-+++)$, for which
$p^2 = -m^2$ for a particle of mass $m$. In contrast, for the time dominant Minkowski metric
$\eta_{\mu\nu} = (+---)$, we would have $p^2 = m^2$. The space dominant convention is more
common in the literature on gravity and string theory, while the time dominant is more
common in particle theory and quantum field theory. The convention then extends to $g_{\mu\nu}$.
To go from one convention to the other, simply flip the sign of $g_{\mu\nu}$. When we flip the sign
of $g_{\mu\nu}$, $\Gamma^\lambda_{\mu\nu}$ does not flip sign, hence $R^\lambda{}_{\rho\mu\nu}$ and $R_{\mu\nu}$ do not flip, but $R$ does.
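
The effect of the signature choice on $p^2$ is easy to see numerically. The Python sketch below builds an on-shell momentum $p = (E, \vec p)$ with $E = \sqrt{m^2 + \vec p^{\,2}}$ and contracts it with each diagonal metric; the names `SPACE_DOMINANT`, `TIME_DOMINANT`, and the sample momentum are illustrative.

```python
import math

SPACE_DOMINANT = (-1.0, 1.0, 1.0, 1.0)   # eta = (-+++), used in this text
TIME_DOMINANT = (1.0, -1.0, -1.0, -1.0)  # eta = (+---)


def on_shell_momentum(m, px, py, pz):
    """Return (E, px, py, pz) with E fixed by the mass-shell condition."""
    energy = math.sqrt(m * m + px * px + py * py + pz * pz)
    return (energy, px, py, pz)


def minkowski_square(p, signature):
    """p^2 = eta_{mu nu} p^mu p^nu for a diagonal metric."""
    return sum(s * c * c for s, c in zip(signature, p))


if __name__ == "__main__":
    p = on_shell_momentum(2.0, 0.3, -1.1, 0.5)
    print(minkowski_square(p, SPACE_DOMINANT))  # close to -m^2
    print(minkowski_square(p, TIME_DOMINANT))   # close to +m^2
```

Flipping the overall sign of the metric flips the sign of every such contraction of two upper-index vectors, which is why $p^2$ changes sign while the components $p^\mu$ do not.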

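In flipping signatures, note that we flip both $g_{\mu\nu}$ and $\eta_{\alpha\beta}$ in $g_{\mu\nu} = \eta_{\alpha\beta}e^\alpha_\mu e^\beta_\nu$. Hence $e^\alpha_\mu$
and $e^\alpha$ do not flip, and so $\omega^\alpha{}_\beta$ and $R^\alpha{}_\beta$ do not flip signs, in agreement with the text. The
scalar curvature $R$, however, does flip.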

Another key sign (called $s_2$ below) is that in $R^\lambda{}_{\rho\mu\nu} = +\partial_\mu\Gamma^\lambda_{\nu\rho} + \cdots$. Beware that some
authors have a minus sign here. The convention used here is such that the sphere has 
positive scalar curvature. 

As another example, consider the weak field expansion $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$. In flipping
signatures, we flip $g_{\mu\nu}$, $\eta_{\mu\nu}$, and hence also $h_{\mu\nu}$ but not $h$. Since $g_{00} = -(1 + 2\Phi)$,
the Newtonian potential is unchanged, as it should be. The relation $R_{\mu\nu} = -\tfrac{1}{2}\partial^2 h_{\mu\nu}$ in
harmonic gauge is also unchanged.
We define some relevant signs here: 


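$$ \eta_{\mu\nu} = s_1\,(-+++) $$
$$ R^\lambda{}_{\rho\mu\nu} = s_2\left(\partial_\mu\Gamma^\lambda_{\nu\rho} + \cdots\right) $$
$$ R_{\mu\nu} = s_3\,R^\sigma{}_{\mu\sigma\nu} = s_2 s_3\,\partial_\sigma\Gamma^\sigma_{\mu\nu} + \cdots $$


$$ R^{\mu\nu} - \tfrac{1}{2}g^{\mu\nu}R = +s_2 s_3\,8\pi G\,T^{\mu\nu} $$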
For your convenience, the signs used in some textbooks are summarized and compared
in the table. For example, Weinberg has $s_1 = +$ and $s_2 = -$, so his $R_{\mu\nu}$ and $R$
are minus the $R_{\mu\nu}$ and $R$ used in this text. Thus, his Einstein-Hilbert action reads
$S = -\frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,R$. In this text, $S = +\frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,R$.


Sign conventions used in various textbooks 


This text +
Weinberg +
MTW +
Hartle +
Cheng +
Schutz +
Carroll +
HEL -
d’Inverno -


Note: MTW = Misner, Thorne, and Wheeler, HEL = Hobson, 
Efstathiou, and Lasenby. 


